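A Method for Extracting Significant Clinical Tests Based on Data Discretization and Rough Set Approximation Techniques: Application to the Differential Diagnosis of Cholecystitis and Cholelithiasis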
- Author(s)
- 손창식; 김민수; 서석태; 조윤경; 김윤년
- Keimyung Author(s)
- Cho, Yun Kyeong; Kim, Yoon Nyun
- Department
- Dept. of Internal Medicine (내과학)
- Journal Title
- 의공학회지 (Journal of Biomedical Engineering Research)
- Issued Date
- 2011
- Volume
- 32
- Issue
- 2
- Keyword
- Data Discretization; Rough Set; Cholecystitis; Cholelithiasis; Differential Diagnosis
- Abstract
- Selecting meaningful clinical tests and their reference values from high-dimensional clinical data with an imbalanced class distribution, in which one class is represented by a large number of examples and the other by only a few, is an important but difficult issue in the differential diagnosis of similar diseases. For this purpose, this study introduces methods that combine the discernibility matrix and discernibility function of rough set theory (RST) with two discretization approaches, equal-width and equal-frequency discretization. The discretization approaches are used to define the reference values for clinical tests, and the discernibility matrix and function are used to extract a subset of significant clinical tests from the resulting nominal attribute values. To show its applicability to the differential diagnosis problem, we applied the method to extract the significant clinical tests and their reference values that distinguish a normal group (N = 351) from an abnormal group (N = 101) with either cholecystitis or cholelithiasis. In addition, we investigated not only the selected significant clinical tests and the variations in their reference values, but also the average predictive performance on four evaluation criteria (accuracy, sensitivity, specificity, and geometric mean) over 10-fold cross-validation. The experimental results confirmed that the rough set approximation methods based on the two discretization approaches give better results with relative frequency than with absolute frequency in terms of average geometric mean. This shows that the prediction model using relative frequency can be used effectively in classification and prediction problems on clinical data with an imbalanced class distribution.
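The two discretization approaches and the geometric mean criterion named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the sample values, and the tie handling are assumptions made for illustration only, and the rough set step (discernibility matrix and function) is not shown. The last lines use the group sizes from the abstract (351 normal, 101 abnormal) with a hypothetical classifier that labels every case as normal, to show why accuracy alone can mislead on imbalanced data.

```python
import math


def equal_width_cuts(values, k):
    """Return k-1 cut points that split [min, max] into k equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [lo + i * width for i in range(1, k)]


def equal_frequency_cuts(values, k):
    """Return k-1 cut points so that each bin holds roughly len(values)/k items."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumption: cut at the value on each bin boundary; ties get no special handling.
    return [ordered[(i * n) // k] for i in range(1, k)]


def discretize(value, cuts):
    """Map a numeric value to a bin index (number of cut points <= value)."""
    return sum(value >= c for c in cuts)


def geometric_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)


# Hypothetical skewed lab values (not taken from the study).
values = [1, 2, 3, 4, 10, 20, 30, 100]

width_cuts = equal_width_cuts(values, 4)
freq_cuts = equal_frequency_cuts(values, 4)

# Nominal (bin index) values under each scheme.
width_bins = [discretize(v, width_cuts) for v in values]
freq_bins = [discretize(v, freq_cuts) for v in values]

# Hypothetical classifier that predicts "normal" for all 452 cases.
tp, fn, tn, fp = 0, 101, 351, 0
accuracy = (tp + tn) / (tp + fn + tn + fp)
g_mean = geometric_mean(tp, fn, tn, fp)
```

On skewed values such as these, equal-width bins place most observations in the first interval, whereas equal-frequency bins spread them evenly. The all-normal classifier reaches an accuracy of about 0.78 yet a geometric mean of 0, which is consistent with the abstract's use of the geometric mean as the deciding criterion.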
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.