Improving Information Extraction using a Probability-Based Approach

@article{.,
	author = {Sanghee Kim and Ken Wallace and Saeema Ahmed},
	title = {Improving Information Extraction using a Probability-Based Approach},
	journal = {Strojniški vestnik - Journal of Mechanical Engineering},
	volume = {53},
	number = {7-8},
	year = {2007},
	keywords = {information searches; name entity identification; natural language processing; taxonomy; probability method},
	abstract = {Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to these entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but low recall, so a learning approach that makes use of pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53\% to 80\%, with comparable precision.},
	issn = {0039-2480},
	pages = {429-441},
	url = {https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/}
}

Authors

  • Sanghee Kim
  • Ken Wallace
  • Saeema Ahmed

Affiliations

  • University of Cambridge, Department of Engineering, UK
  • University of Cambridge, Department of Engineering, UK
  • Technical University of Denmark, Department of Mechanical Engineering, Denmark

Paper's information

Strojniški vestnik - Journal of Mechanical Engineering 53(2007)7-8, 429-441
© The Authors, CC-BY 4.0 Int. Change in copyright policy from 2022, Jan 1st.

Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to these entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but low recall, so a learning approach that makes use of pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53% to 80%, with comparable precision.

information searches; name entity identification; natural language processing; taxonomy; probability method
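
The abstract reports its results as recall and precision. As a reading aid, the sketch below shows how these two measures are typically computed for entity identification. It is not the authors' evaluation code: the function name `precision_recall`, the mention format (document id, entity text), and all of the data are hypothetical and chosen only to illustrate how missed wording variants lower recall without affecting precision.

```python
def precision_recall(predicted, gold):
    """Return (precision, recall) for two collections of entity mentions."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of identified mentions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of correct mentions that were identified.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold-standard mentions as (document id, entity text) pairs.
gold = {
    (1, "gas turbine"),
    (1, "compressor blade"),
    (2, "gas-turbine engine"),
    (2, "GT"),
    (3, "turbine disc"),
}

# Hypothetical output of exact-match rules: correct, but misses variants.
rule_based = {(1, "gas turbine"), (1, "compressor blade")}

# Hypothetical output of a learned identifier: finds variants, one error.
learned = {
    (1, "gas turbine"),
    (1, "compressor blade"),
    (2, "gas-turbine engine"),
    (2, "GT"),
    (3, "fuel"),
}

print(precision_recall(rule_based, gold))  # (1.0, 0.4)
print(precision_recall(learned, gold))     # (0.8, 0.8)
```

In this made-up example the rule-based output is fully precise but finds only two of the five mentions, while the second output trades one false positive for much higher recall.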