Improving Information Extraction using a Probability-Based Approach

Export citation: ABNT

KIM, Sanghee; WALLACE, Ken; AHMED, Saeema.
Improving Information Extraction using a Probability-Based Approach.
Strojniški vestnik - Journal of Mechanical Engineering, [S.l.], v. 53, n. 7-8, p. 429-441, 2007.
ISSN 0039-2480.
Available at: <https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/>. Date accessed: 18 apr. 2025. 

Kim, S., Wallace, K., & Ahmed, S.
(2007).
Improving Information Extraction using a Probability-Based Approach.
Strojniški vestnik - Journal of Mechanical Engineering, 53(7-8), 429-441.

@article{kim2007improving,
	author = {Sanghee Kim and Ken Wallace and Saeema Ahmed},
	title = {Improving Information Extraction using a Probability-Based Approach},
	journal = {Strojniški vestnik - Journal of Mechanical Engineering},
	volume = {53},
	number = {7-8},
	year = {2007},
	keywords = {information searches; name entity identification; natural language processing; taxonomy; probability method},
	abstract = {Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to the entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but lower recall; a learning approach that generalizes from pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53% to 80%, with comparable precision.},
	issn = {0039-2480},
	pages = {429-441},
	url = {https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/}
}

Kim, S., Wallace, K., Ahmed, S.
2007 November 53. Improving Information Extraction using a Probability-Based Approach. Strojniški vestnik - Journal of Mechanical Engineering. [Online] 53:7-8

%A Kim, Sanghee
%A Wallace, Ken
%A Ahmed, Saeema
%D 2007
%T Improving Information Extraction using a Probability-Based Approach
%B 2007
%9 information searches; name entity identification; natural language processing; taxonomy; probability method;
%! Improving Information Extraction using a Probability-Based Approach
%K information searches; name entity identification; natural language processing; taxonomy; probability method
%X Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to the entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but lower recall; a learning approach that generalizes from pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53% to 80%, with comparable precision.
%U https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/
%0 Journal Article
%R
%& 429
%P 13
%J Strojniški vestnik - Journal of Mechanical Engineering
%V 53
%N 7-8
%@ 0039-2480
%8 2017-11-03
%7 2017-11-03

Kim, Sanghee, Ken Wallace, & Saeema Ahmed.
"Improving Information Extraction using a Probability-Based Approach." Strojniški vestnik - Journal of Mechanical Engineering [Online], 53.7-8 (2007): 429-441. Web. 18 Apr. 2025

TY - JOUR
AU - Kim, Sanghee
AU - Wallace, Ken
AU - Ahmed, Saeema
PY - 2007
TI - Improving Information Extraction using a Probability-Based Approach
JF - Strojniški vestnik - Journal of Mechanical Engineering
DO -
KW - information searches; name entity identification; natural language processing; taxonomy; probability method
N2 - Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to the entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but lower recall; a learning approach that generalizes from pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53% to 80%, with comparable precision.
UR - https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/

@article{kim2007improving,
	author = {Kim, S. and Wallace, K. and Ahmed, S.},
	title = {Improving Information Extraction using a Probability-Based Approach},
	journal = {Strojniški vestnik - Journal of Mechanical Engineering},
	volume = {53},
	number = {7-8},
	year = {2007},
	url = {https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/}
}

TY - JOUR
AU - Kim, Sanghee
AU - Wallace, Ken
AU - Ahmed, Saeema
PY - 2017/11/03
TI - Improving Information Extraction using a Probability-Based Approach
JF - Strojniški vestnik - Journal of Mechanical Engineering; Vol 53, No 7-8 (2007): Strojniški vestnik - Journal of Mechanical Engineering
DO -
KW - information searches, name entity identification, natural language processing, taxonomy, probability method
N2 - Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to the entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but lower recall; a learning approach that generalizes from pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53% to 80%, with comparable precision.
UR - https://www.sv-jme.eu/article/improving-information-extraction-using-a-probability-based-approach/

Kim, Sanghee, Wallace, Ken, AND Ahmed, Saeema.
"Improving Information Extraction using a Probability-Based Approach" Strojniški vestnik - Journal of Mechanical Engineering [Online], Volume 53 Number 7-8 (03 November 2017)

Authors

Sanghee Kim
Ken Wallace
Saeema Ahmed

Affiliations

University of Cambridge, Department of Engineering, UK
University of Cambridge, Department of Engineering, UK
Technical University of Denmark, Department of Mechanical Engineering, Denmark

Paper's information

Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient, and key personnel move to other companies or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. Keyword-based search is commonly used, but studies have shown that these searches often prove unsuccessful. Searches can be improved if domain entities of interest, e.g., 'gas turbine', are identified, but texts refer to the entities in various different ways. It would be helpful to compile a full list of entities associated with the relevant types before identifying them in texts. However, due to the various ways of referring to entities in the texts, manually defined identification rules tend to produce high precision but lower recall; a learning approach that generalizes from pre-defined variations looks promising. This paper presents the results of developing such a probability-based entity-identification approach. Tests show that the proposed approach achieves improved recall, i.e., from 53% to 80%, with comparable precision.

information searches; name entity identification; natural language processing; taxonomy; probability method