I-1: PROPOSAL TITLE

PSU Research Proposal
Title: Health Informatics: Use of Medical Data Mining to Enhance Service, Diagnosis, and Reduce Costs
Department: Computer Science
PI Name: Ahmed Sameh
Duration: 1 Year
Budget Est.: SR 55,000
Date: 12/20/2010
I - PROPOSAL
I-1: PROPOSAL TITLE (Provide a short descriptive title, give prominence to keywords)
Health Informatics: Use of Medical Data Mining to Enhance Service, Diagnosis, and
Reduce Costs
I-2: COMMERCIAL POTENTIAL
Could this project have commercial potential? (Select one): Yes / No
If yes, briefly elaborate on the commercial potential
Healthcare is significantly affected by technological advancement: technology both shapes and changes
healthcare systems. As computer science, information technology, and healthcare merge, it is important to
understand the current and future implications of health informatics. Data mining is a recent advance in
empirical research. Applying it to healthcare informatics can reveal the consequences, both positive and
negative, of health informatics for individual and public health. On that basis, we can create commercial
strategies for health-based services and products that support new lines of health-aware business. For
example, if the analysis finds a negative correlation between heavy use of wireless mobile phones and
health hazards to the brain, a commercial strategy can promote the product on the strength of that finding
(skin creams and skin cancer are another example). We can also build a set of tools, to be sold to
healthcare providers, that reduce healthcare costs by analyzing individuals’ healthcare data and generating
deviation reports that describe excess spending. A methodology that was developed by one of the
investigators has been enhanced and adapted to the Saudi environment.
I-3: CHECK-LIST
Have you checked to ensure all questions in the application form have been answered?
Have you checked to ensure you have included the correct costs in your budget?
The principal investigator and all co-principal investigators should sign.
I-4: PERSONNEL AND AUTHORIZATION
PRINCIPAL INVESTIGATOR [PI]
Full Name: Ahmed Sameh
Academic Rank: Professor
Department: Computer Science
College: CIS
Telephone: 494-8524
Ext: X8524
Mobile: 0544299846
E-Mail: asameh@cis.psu.edu.sa
Signature:
Date: 12/20/2010
CO-INVESTIGATOR(S) [CIs] (non-PSU CIs permitted)
1)
Full Name: Mohamed El-Affendi
Academic Rank: Professor
Department: Computer Science
College: CIS
Telephone:
Mobile:
E-Mail:
Signature:
Date:   /   /
2)
Full Name: Gregory Shapiro (University of Massachusetts, Lowell, USA)
Academic Rank: Associate Professor
Department: Computer Science
College: Science & Engineering
Telephone:
Mobile:
E-Mail:
Signature:
Date:   /   /
3)
Full Name: Mohamed Tunsi
Academic Rank: Associate Professor
Department: Computer Science
College: CIS
Telephone:
Mobile:
E-Mail:
Signature:
Date:   /   /
4)
Full Name: Ayman Kassem (King Fahd University)
Academic Rank: Associate Professor
Department: Computer Science
College:
Telephone:
Mobile:
E-Mail:
Signature:
Date:   /   /
5)
Full Name:
Academic Rank:
Department:
College:
Telephone:
Mobile:
E-Mail:
Signature:
Date:   /   /
6)
Full Name:
Academic Rank:
Department:
College:
Telephone:
Mobile:
E-Mail:
Signature:
Date:   /   /
II - DESCRIPTION
II-1: ABSTRACT (Provide a statement of the project - maximum 200 words)
Health informatics (also called health care informatics, healthcare informatics, or medical informatics) is a
discipline at the intersection of information science, computer science, and health care. It deals with resources
and methods required to optimize the acquisition, storage, retrieval, and use of information in health and
medical research. It is applied to the areas of medical research, clinical care, dentistry, pharmacy, nursing,
and public health.
In the area of medical research: With analytical data mining, the screening, diagnosis, and detection of
diseases may become more efficient by reducing both the time and the cost of the corresponding procedures.
It can also be used to improve individuals’ medication by using patients’ medication history to promote
specific drugs directly to certain patients. It can be effective in providing low-cost screening using disease
models that require only easily obtained attributes from historical cases. It can perform automated analysis
of pathological signals (ECG, EEG, EMG) and medical images (MRI, CT, X-ray, and ultrasound). Data mining
can also produce more accurate results in empirical medical research, for example by classifying patterns of
urinary kidney stones through clustering.
Data mining can be used to improve clinical care, for example in healthcare management. Time series
analysis algorithms can use historical data to predict patient volume per month, patient volume per medical
specialization, length of stay for incoming patients per medical department, and ambulance run volume per
month. Data mining can also support clinical decision support systems and information workflows.
Data mining can be used in dentistry: It can produce dental and anatomical models for dentists. It can also be
used to improve dental management, and it can be used effectively in dental marketing and teledentistry
consultation services. It can classify full crown and bridge systems, implant systems, and cosmetic
restorations. Lastly, it can be used to analyze X-rays of the head and neck region, and to improve infection
control and pharmaceuticals for dental use.
Data mining can be used in pharmacy: Classification and clustering algorithms can be used to group and
recommend supplements, vitamins, and nutritional products. Association algorithms can be used to discover
relationships between medications. Data mining techniques can also be used to enhance alternative medicine
(acupuncture, Chinese medicine, and herbs) by discovering correlative effects between these alternative
medicines and their corresponding chemical ones.
Data mining can be used in nursing and public health care. It can better identify the work needs of each
nursing specialization. It can be used to study epidemics and the way they spread in poor communities.
Data mining can also be used to deliver summary medical reports to hand-held portable devices to assist
providers with data entry and retrieval or with medical decision making, a practice sometimes called mHealth.
Acquiring medical data for data mining algorithms is a difficult task, especially in Saudi Arabia.
Although most healthcare and medical facilities in KSA collect large amounts of digital data, they are
hesitant to make this data available for research. For this reason, the scope of this proposal cannot yet be
fixed precisely. We have made some arrangements for collecting data that we hope will eventually work, and
the project will focus on the areas for which we can secure data.
As a start, we will explore only three of the above areas (management, medical research, and reducing
healthcare costs) until we find an area that is rich in data, background knowledge, and specific
investigation queries. It is not clear at the moment which area will open up for us, because healthcare
agencies in KSA are reluctant to provide their own data and background knowledge. For healthcare
management and medical research, we were able to obtain two Saudi data sets. The first is the monthly patient
volume at the “Family Community Primary Healthcare Clinic of King Faisal University”; we used this data set
to forecast future volumes from past data with a time series analysis algorithm. The second data set is
urinary kidney stones from the Division of Urology, Department of Surgery, King Khalid University Hospital;
we used this data set to classify the samples by cluster analysis of ionic composition. These two experiments
gave good results and are a positive indication that further data sets can be acquired and used by this
project. The third direction of investigation is “reducing healthcare costs” by analyzing individuals’ data
and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can
also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”.
II-2: PROJECT GOALS AND OBJECTIVES
The specific goals of this project are to demonstrate the power of data mining in using healthcare informatics
to enhance:
1- Medical applications: screening, diagnosis, therapy, prognosis, monitoring, biomedical/biological
analysis, epidemiological studies, and hospital management. Examples include classifying urinary stones by
cluster analysis of ionic composition data; efficient screening tools that reduce demand on costly
healthcare resources; forecasting ambulance run volume; and predicting length of stay for incoming patients.
Diagnosis and classification: for ECG interpretation, a neural network can predict which output applies
(SV tachycardia, ventricular tachycardia, LV hypertrophy, RV hypertrophy, or myocardial infarction).
Diagnosis and classification methods also assist decision making when there are many inputs, for example
automated analysis of pathological signals (ECG, EEG, EMG) and medical images (mammograms, ultrasound,
X-ray, CT, and MRI). Target conditions include heart attacks, chest pain, rheumatic disorders, myocardial
ischemia (using the ST-T ECG complex), and coronary artery disease (using SPECT images).
2- Patient medication: Medicine revolves around pattern recognition, classification, and prediction.
Diagnosis: recognize and classify patterns in multivariate patient attributes. Therapy: select from the
available treatment methods based on effectiveness and suitability to the patient. Prognosis: predict future
outcomes based on previous experience and present conditions. Examples include forecasting patient volume
using univariate time-series analysis and improving classification of multiple dermatology disorders by
problem decomposition.
3- Modeling obesity in Saudi Arabian youth; modeling the educational score in Saudi school health surveys;
gaining better insight into medical survey data; and building disease models for the instruction and
assessment of undergraduate medical and nursing students. Epidemiological studies: the study of health,
disease, morbidity, injuries, and mortality in human communities, for example predicting outbreaks in
simulated populations, assessing asthma strategies in inner-city children, discovering patterns relating
outcomes to exposures, and studying independence or correlation between diseases. Detecting pathological
conditions, for example by tracking glucose levels. Accurate prognosis (prediction) and risk assessment for
improved disease management and outcome, for example predicting ambulation following spinal cord injury,
survival analysis for AIDS patients, predicting pre-term birth risk, and determining cardiac surgical risk.
A separate direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and
discovering deviations that lead to higher costs. Deviation analysis is a data mining technique that can
also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”.
Saudi Health Care Data:
One of the biggest problems in this research is where to obtain Saudi data. The above list represents
possible tracks for the project; which of them we pursue depends on the data that we find. We may find
partial or incomplete data that must first be prepared. This raises two questions: how should the data be
prepared and pre-processed, and is it possible to use non-Saudi data for proof of concept?
III - INTRODUCTION
III-1: REVIEW AND ANALYSIS OF RELATED WORK
Medical expert systems such as MYCIN and Internist were among the first computerized systems in
healthcare and medical applications. The use of data mining in healthcare informatics is a new direction of
research.
In Saudi Arabia, the Saudi Association for Health Information (SAHI) was established in 2006 to work under
the direct supervision of King Saud University for Health Sciences to practice public activities, develop
theoretical and applicable knowledge, and provide scientific and applicable studies. SAHI is concerned with
the use of information in health care by clinicians. SAHI transforms health care by analyzing, designing,
implementing, and evaluating information and communication systems that enhance individual and
population health outcomes, improve patient care, and strengthen the clinician-patient relationship. SAHI
members use their knowledge of patient care, combined with their understanding of informatics concepts,
methods, and health informatics tools, to:
- assess information and knowledge needs of health care professionals and patients,
- characterize, evaluate, and refine clinical processes,
- develop, implement, and refine clinical decision support systems, and
- lead or participate in the procurement, customization, development, implementation, management,
evaluation, and continuous improvement of clinical information systems.
Physicians who are board-certified in clinical informatics collaborate with other health care and information
technology professionals to develop health informatics tools which promote patient care that is safe, efficient,
effective, timely, patient-centered, and equitable. The purpose of this project is to add to these health
informatics tools.
III-2: SIGNIFICANCE OF WORK
The significance of this project lies in its application to healthcare, a very important and wide field. The
outputs of this project can be new results in medical research: With analytical data mining, the screening,
diagnosis, and detection of diseases may become more efficient by reducing both the time and the cost of the
corresponding procedures. It can also be used to improve individuals’ medication by using patients’
medication history to promote specific drugs directly to certain patients. It can be effective in providing
low-cost screening using disease models that require only easily obtained attributes from historical cases.
It can perform automated analysis of pathological signals (ECG, EEG, EMG) and medical images (MRI, CT,
X-ray, and ultrasound). Data mining can also produce more accurate results in empirical medical research,
for example by classifying patterns of urinary kidney stones through clustering.
The outputs of this project can be used to improve clinical care, for example through the results of applying
data mining to healthcare management. Time series analysis algorithms can use historical data to predict
patient volume per month, patient volume per medical specialization, length of stay for incoming patients
per medical department, and ambulance run volume per month. Data mining can also support clinical decision
support systems and information workflows.
The outputs of this project can also be used in dentistry, pharmacy, and nursing. For example, data mining
can be used to deliver summary medical reports to hand-held portable devices to assist providers with data
entry and retrieval or with medical decision making, a practice sometimes called mHealth. The outputs of
this project can also be used for insurance fraud detection, infection control, and medical waste
management. A further direction of investigation is “reducing healthcare costs” by analyzing individuals’
data and discovering deviations that lead to higher costs. Deviation analysis is a data mining technique
that can also discover fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”.
Acquiring medical data for data mining algorithms is a difficult task, especially in Saudi Arabia.
Although most healthcare and medical facilities in KSA collect large amounts of data, they are hesitant to
make this data available for research. In this project, we have made some arrangements for collecting data
that we hope will eventually work.
IV - APPROACH AND METHODOLOGY
IV-1: METHODOLOGY
Many data mining methods can be applied to healthcare information, such as time series prediction,
classification, clustering, and association. Such algorithms can be applied to the various domains in
healthcare informatics:
1- Medical applications: screening, diagnosis, therapy, prognosis, monitoring, biomedical/biological
analysis, epidemiological studies, and hospital management. Examples include forecasting patient volume
using univariate time-series analysis; classifying urinary stones by cluster analysis of ionic composition
data; optimizing the allocation of hospital resources; forecasting ambulance run volume; and predicting
length of stay for incoming patients. Therapy: based on modeled historical performance, select the best
intervention course, for example the best treatment plans in radiotherapy, or use a patient model to predict
the optimum medication dosage, for example for diabetics. Accurate prognosis (prediction) and risk
assessment for improved disease management and outcome, for example predicting ambulation following
spinal cord injury, survival analysis for AIDS patients, predicting pre-term birth risk, and determining
cardiac surgical risk. Diagnosis and classification: for ECG interpretation, a neural network can predict
which output applies (SV tachycardia, ventricular tachycardia, LV hypertrophy, RV hypertrophy, or
myocardial infarction). Diagnosis and classification methods also assist decision making when there are
many inputs, for example automated analysis of pathological signals (ECG, EEG, EMG) and medical images
(mammograms, ultrasound, X-ray, CT, and MRI). Target conditions include heart attacks, chest pain,
rheumatic disorders, myocardial ischemia (using the ST-T ECG complex), and coronary artery disease (using
SPECT images). Risk assessment for improved disease management, for example for spinal cord injuries and
heart attacks.
2- Modeling the educational score in Saudi school health surveys; modeling obesity in Saudi Arabian youth.
Epidemiological studies: the study of health, disease, morbidity, injuries, and mortality in human
communities, for example predicting outbreaks in simulated populations and assessing asthma strategies in
inner-city children.
3- Better insight into medical survey data; effective data fusion from multiple sensors; efficient screening
tools that reduce demand on costly healthcare resources; discovering patterns relating outcomes to
exposures; studying independence or correlation between diseases; detecting pathological conditions, for
example by tracking glucose levels; and data fusion from various sensing modalities in ICUs to assist
overburdened medical staff.
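Of the four method families named at the start of this section, association is the one whose output is
least obvious from its name. The following is a minimal sketch in Python, not the project's implementation,
of how pairwise association rules could be scored by support, confidence, and lift. The function name
pair_rules, the medication names, and the records are all hypothetical and serve only as an illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.3):
    """Score every rule x -> y between pairs of items.

    Returns tuples (x, y, support, confidence, lift), highest lift first.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of records containing both items
        if support < min_support:
            continue
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]       # P(y | x)
            lift = confidence / (item_counts[y] / n)  # P(y | x) / P(y)
            rules.append((x, y, support, confidence, lift))
    return sorted(rules, key=lambda r: (-r[4], r[0], r[1]))


# Hypothetical prescription records, one set of medications per patient.
prescriptions = [
    {"metformin", "statin"},
    {"metformin", "statin", "aspirin"},
    {"aspirin"},
    {"metformin", "statin"},
    {"aspirin", "statin"},
]
rules = pair_rules(prescriptions, min_support=0.5)
for x, y, support, confidence, lift in rules:
    print(f"{x} -> {y}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the two medications appear together more often than independence would predict. Real
data would call for a full frequent-itemset algorithm rather than this pairwise scan.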
For example, the figure shows the medical chart of a patient. Methods from the above three categories can be
applied to this chart to discover deviation measures. This direction of investigation is “reducing healthcare
costs” by analyzing individuals’ data and discovering deviations that lead to higher costs. Deviation analysis
is a data mining technique that can also discover fraud and misuse. The proposed system is called “KEFIR:
Key Findings Reporter” (see figure below).
The proposed system will apply deviation analysis techniques to individuals’ healthcare data to find
“interesting deviations”. The system will then augment these findings with plausible causes and recommend
appropriate actions. Each healthcare provider can apply the proposed system to its own set of insured
individuals, across each of the medical areas covered: Inpatient, Outpatient, Surgical, Maternity, etc. For
each area and each patient, the proposed system will apply “measures and formulas” to discover large
deviations from the norms, or deviations from the previous period and/or the next period. Through models and
formulas, such deviations can be converted to costs.
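The deviation-to-cost step described above can be sketched in a few lines of Python. This is an
illustrative sketch under stated assumptions, not the KEFIR formulas: the Finding class, the 10% threshold,
the linear cost model (deviation times a unit cost), and every figure below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    area: str          # medical area, e.g. Inpatient or Maternity
    measure: str       # what is being measured in that area
    observed: float    # value in the current period
    norm: float        # expected value (a norm or the previous period)
    unit_cost: float   # assumed cost in SAR per unit of the measure

    @property
    def relative_deviation(self) -> float:
        return (self.observed - self.norm) / self.norm

    @property
    def cost_impact(self) -> float:
        # Assumed linear model: excess units times cost per unit.
        return (self.observed - self.norm) * self.unit_cost


def interesting_deviations(findings, threshold=0.10):
    """Keep deviations of at least `threshold`, largest cost impact first."""
    flagged = [f for f in findings if abs(f.relative_deviation) >= threshold]
    return sorted(flagged, key=lambda f: abs(f.cost_impact), reverse=True)


# Hypothetical measures for one insured population.
findings = [
    Finding("Inpatient", "days per 1000 members", 460.0, 400.0, 1500.0),
    Finding("Outpatient", "visits per 1000 members", 3050.0, 3000.0, 200.0),
    Finding("Maternity", "admissions per 1000 members", 18.0, 15.0, 9000.0),
]
report = interesting_deviations(findings)
for f in report:
    print(f"{f.area}: {f.relative_deviation:+.0%} vs norm, cost impact SAR {f.cost_impact:,.0f}")
```

Ranking by cost impact rather than by percentage deviation keeps a small deviation in an expensive area
ahead of a large deviation in a cheap one.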
Deliverables in phase I: Beta Version I + its Benchmark + its Tuning
Deliverables in Phase II: Beta Version II + its Benchmark + its Tuning
Deliverables in Phase III: Beta Version III + its Benchmark + its Tuning
Deliverables in Phase IV: Final Version + User Manual
The following is the project plan schedule. It represents those different tasks within the research and
estimated duration for each.
IV-2: AVAILABLE RESOURCES
Currently there are some open source data mining algorithms that can be used as tools in some of the above
investigations.
IV-3: EXPECTED RESULTS/OUTPUTS
Health care is a very important and wide field. The outputs of this project can be new results in medical
research: With analytical data mining, the screening, diagnosis, and detection of diseases may become more
efficient by reducing both the time and the cost of the corresponding procedures. It can also be used to
improve individuals’ medication by using patients’ medication history to promote specific drugs directly to
certain patients. It can be effective in providing low-cost screening using disease models that require only
easily obtained attributes from historical cases. It can perform automated analysis of pathological signals
(ECG, EEG, EMG) and medical images (MRI, CT, X-ray, and ultrasound). Data mining can also produce more
accurate results in empirical medical research, for example by classifying patterns of urinary kidney stones
through clustering.
The outputs of this project can be used to improve clinical care, for example through the results of applying
data mining to healthcare management. Time series analysis algorithms can use historical data to predict
patient volume per month, patient volume per medical specialization, length of stay for incoming patients
per medical department, and ambulance run volume per month. Data mining can also support clinical decision
support systems and information workflows.
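As an illustration of forecasting a monthly volume from historical data, the following Python sketch uses
simple exponential smoothing. The proposal does not name a specific time series algorithm, so this choice,
the function name, the smoothing factor, and the visit counts are all assumptions made for the example.

```python
def exponential_smoothing_forecast(series, alpha=0.5):
    """One-step-ahead forecast by simple exponential smoothing.

    `alpha` in (0, 1] weights the newest observation against the running level.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


# Hypothetical monthly patient volumes, oldest first.
monthly_visits = [100, 120, 110, 130]
forecast = exponential_smoothing_forecast(monthly_visits, alpha=0.5)
print(f"Forecast for next month: {forecast:.1f} visits")
```

Simple smoothing ignores trend and seasonality. Monthly clinic data often has both, in which case a
seasonal method would be a better fit.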
The outputs of this project can also be used in dentistry, pharmacy, and nursing. For example, data mining
can be used to deliver summary medical reports to hand-held portable devices to assist providers with data
entry and retrieval or with medical decision making, a practice sometimes called mHealth.
As a start, we will explore only two of the above areas (management and medical research) until we find an
area that is rich in data, background knowledge, and specific investigation queries. It is not clear at the
moment which area will open up for us, because healthcare agencies in KSA are reluctant to provide their
own data and background knowledge. For healthcare management and medical research, we were able to obtain
two Saudi data sets. The first is the monthly patient volume at the “Family Community Primary Healthcare
Clinic of King Faisal University”; we used this data set to forecast future volumes from past data with a
time series analysis algorithm. The second data set is urinary kidney stones from the Division of Urology,
Department of Surgery, King Khalid University Hospital; we used this data set to classify the samples by
cluster analysis of ionic composition. These two experiments gave good results and are a positive
indication that further data sets can be acquired and used by this project.
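To make the cluster analysis step concrete, the following Python sketch groups samples with a basic k-means
procedure. It does not reproduce the experiment above: the clustering algorithm used there is not stated,
and the two features (assumed here to be calcium and uric acid fractions) and all sample values are
hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Basic k-means. The first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each sample joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical samples: (calcium fraction, uric acid fraction) per stone.
samples = [
    (0.90, 0.10), (0.10, 0.90), (0.85, 0.15),
    (0.20, 0.80), (0.80, 0.20), (0.15, 0.85),
]
labels, centroids = kmeans(samples, k=2)
print(labels)
print(centroids)
```

Seeding from the first k points keeps the sketch reproducible. On real data, several random restarts and a
check on the number of clusters would be needed.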
The following diagrams are results from the two data sets:
The methods above can be applied to individual patients’ charts to discover deviation measures. This
direction of investigation is “reducing healthcare costs” by analyzing individuals’ data and discovering
deviations that lead to higher costs. Deviation analysis is a data mining technique that can also discover
fraud and misuse. The proposed system is called “KEFIR: Key Findings Reporter”. The proposed system will
apply deviation analysis techniques to individuals’ healthcare data to find “interesting deviations”. The
system will then augment these findings with plausible causes and recommend appropriate actions. Each
healthcare provider can apply the proposed system to its own set of insured individuals, across each of the
medical areas covered: Inpatient, Outpatient, Surgical, Maternity, etc. For each area and each patient, the
proposed system will apply “measures and formulas” to discover large deviations from the norms, or
deviations from the previous period and/or the next period. Through models and formulas, such deviations
can be converted to costs.
V - REFERENCES
1- Saudi Ministry of Health http://www.moh.gov.sa/english/index.php
2- SAMIRAD http://www.saudinf.com/main/c6m.htm
VI - ROLE(S) OF THE INVESTIGATOR(S)
(Attach a brief CV for each investigator following the format in Appendix A)
# | Name of Investigator | Area of contribution to the project
1 | Prof. Ahmed Sameh | System Design & Implementation
2 | Prof. Mohamed El-Affendi | Data Collection & Preparation
3 | Dr. Mohamed Tunsi | Data Mining Tools
4 | Dr. Gregory Shapiro | System Design & Implementation
5 | Dr. Ayman Kassem | Testing
6 | |
VII - PROJECT SCHEDULE
PHASES OF PROJECT IMPLEMENTATION (SEE GANTT CHART ABOVE)
Step 1 tasks:
- System requirements specifications: Sameh, Tunsi
- System Architecture: El-Affendi
- System Design: Sameh, Greg
- Databases Designs: Greg
- Prototyping of critical sub-systems: Tunsi, Sameh
- System Detailed Design: Sameh, Tunsi
- Beta Version Implementation: Sameh, Ayman
- Testing: El-Affendi
- Building Deployment Environment: Sameh
- Benchmarking and Collecting Results (First Round): Tunsi
- System Tuning (Based on First Round Results): Sameh
- Benchmarking and Collecting Results (Second Round): El-Affendi
- System Tuning (Based on Second Round Results): Sameh
- Benchmarking and Collecting Results (Third Round): Tunsi
- Version 1 Release: Tunsi
- Results Documentation and Analysis with the Performance requirements: Sameh
- Detailed Code Documentation: Sameh
- User and Installation Guide (Full How To): Ayman
Duration (Months): See Gantt Chart within this proposal
Total duration for the proposed project: 12 Months
VIII - BUDGET OF THE PROPOSED RESEARCH (Budget in SAR)
Priority: 1 = Max; 2 = Mod; 3 = Low. The Amount Approved (SAR) column is for official use.
Item | Amount Requested (SAR) | Priority | Amount Approved (SAR)
A. Personnel* (Research Assistant) | 24,000 | 1 |
   1- Student Ahmed Al-Jabreen
   2- Student Kamal Qarawi
   3- Student Omar Al-Moughnee
   4- Student Amro Al-Munajjed
B. Equipment* (List) | 5,000 | 1 |
   Development Server
C. Testing and Analysis* (Location/Laboratory) | 5,000 | 2 |
   Laptop Computer
D. Consumables* (List) | 1,000 | 2 |
   Desk Tools
E. Travel* (Local/Internat) | 10,000 | 1 |
   1- Travel for Gregory (Lowell, Massachusetts / Riyadh)
   2- Travel for Ayman (Dhahran / Riyadh)
F. Software* (List) | 10,000 | 1 |
   -SAS Data Mining Tools
   -Oracle 9i Data Mining
   -Clementine from SPSS
   -Ants Model Builder
G. Other Items* (Itemize) | --- | |
Total Amount Requested (SAR) | 55,000 | |
IX- JUSTIFICATION OF BUDGET (Justify each item listed in the budget in the previous section)
Item | Justification
A. Student Research Assistants | Salary of SR 500 per month for each student for 12 months, the duration of the project.
B. Development Server | For developing the proposed experiments.
C. Laptop Computer | For on-site data collection and on-site testing.
D. Desk Tools | For general use by team members.
E. Travel | For the two team members from outside PSU.
F. Software | Data mining tools software.
G. |
X - RELEASE TIME FOR RESEARCH TEAM MEMBERS
RELEASE TIME FROM TEACHING LOAD
# | Team Member | Time Commitment (hrs/weeks/terms) | Teaching Load Max (e.g. 1 course FA11)
PI | Ahmed Sameh | 4 h/w |
CI1 | Mohamed El-Affendi | 2 h/w |
CI2 | Mohamed Tunsi | 2 h/w |
CI3 | Gregory Shapiro | 1 h/w |
CI4 | Ayman Kassem | 1 h/w |
CI5 | | |
XI - EXTERNAL FUNDING
# | Source of Funds | Amount (SAR) | Used for …… costs
1 | None | |
2 | | |
3 | | |
Appendix A: CV Format for Principal Investigator and Co-Investigators
(Two pages maximum, material should be related to submitted project)
Title and Name: Professor Ahmed Sameh
Specialty: Artificial Intelligence, Modeling and Information Systems
Department and College: Computer Science
Summary of Experience/Achievements Related to Research Proposal:
1- Ahmed Sameh, Ayman Kassem, “Lumbar Spine: Parameter Estimation for Realistic Modelling”, WSEAS
Transactions on Applied and Theoretical Mechanics, ISSN:1991-8747, Issue 5, Volume 2, May 2008
2- Ahmed Sameh, Ayman Kassem, “A General Framework for Lumbar Spine Modelling and Simulation”,
International Journal of Human Factors in Modelling and Simulation, IJHFMS, The North American Spine
Society, Volume 1, Issue 2, January 2008
3- Dalia El-Mansy, Ahmed Sameh, “A Collaborative Inter-Data Grid Strong Semantic Model with Hybrid
Namespaces”, Journal of Software (JSW), Academic Publisher, Volume 3, Issue 1, January 2008
4- Ahmed Sameh, “Simulating Lumbar Spine Motion”, Research in Computing Science (RCS) Journal,
National Polytechnic Institute of Mexico, ISSN 1665-9899, Volume 18, Issue 4, June 2007
5- Ahmed Sameh, and Ayman Kassem, “3D Modeling and Simulation of Lumbar Spine Dynamics”, in the
International Journal of Human Factors Modelling and Simulation , Volume IJHFMS-942, 2007
6-Adhami Louai, Abdel-Malek Karim, McGowan Dennis, Mohamed A. Sameh, "A Partial Surface/Volume
Match for High Accuracy Object Localization", International Journal of Machine Graphics and Vision, vol
10, no. 2, 2001
7-Mohamed A. Sameh, “Interactive Learning in Artificial Neural Networks Through Visualization”, The
International Journal of Computers and Applications (IJCA), Vol. 20, #2, 1998
8- Mohamed A. Sameh and Attia E. Emad, "Parallel 1D and 2D Vector Quantizers Using Kohonen
Self-Organizing Neural Network", in the International Journal of the Neural Computing and Applications, V.
(4), no. 2, Springer Verlag, London, 1996
9- Ahmed Sameh, Amgad Madkour, “Intelligent open Spaces: Learning User History Using Neural Network
for Future Prediction of Requested Resources”, Proceedings IEEE CSE'08, 11th IEEE International
Conference on Computational Science and Engineering, 16-18 July 2008, São Paulo, SP, Brazil. IEEE
Computer Society 2008, ISBN 978-0-7695-3193-9
10- Ahmed Sameh, Ayman Kassem, “Modelling and Simulation of Human Lumbar Spine”, Proceedings of
the 2008 International Conference on Modelling, Simulation, and Visualization, MSV 2008, Las Vegas,
Nevada, July 14-17, 2008, CSREA Press 2008, ISBN 1-60132-081-7
11- Ahmed Sameh, Dalia El-Mansy, “A Collaborative Inter-Data Grids Model with Hybrid Namespace”, 14th
IEEE International Conference on Availability, Reliability, and Security, (DAWAM – ARES 2007), Vienna,
Austria, April 10-13, 2007
12- Ahmed Sameh, “Simulating Lumbar Spine Motion: Parameter Estimation for Realistic Modelling”, The
6th Mexican International Conference on Artificial Intelligence (MICAI07), Aguascalientes, Mexico,
November 4-10, 2007
13- Sherif Akoush, Ahmed Sameh, “Bayesian Learning of Neural Networks for Mobile User Position
Prediction”, The International Workshop on Performance Modelling and Evaluation in Computers and
telecommunication Networks (PMECT07)- part of the IEEE 16th International Conference on Computer
Communications and Networks, ICCCN 2007, Honolulu, Hawaii, August 13-16, 2007
14- Ahmed Sameh, “The Schlumberger High Performance Cluster at AUC”, Proceedings of the 13th
International Conference on Artificial Intelligence Applications, Cairo, February 4-6, 2005
15-Mohamed A. Sameh, Rehab El-Kharboutly, "Modeling a Service Discovery Bridge Using Rapide
Architecture Description Language", Proceedings of the 18th European Simulation Multiconference (ESM
2004), Magdeburg, Germany, June 13-16, 2004
16-Mohamed A. Sameh, Rehab El-Kharboutly, and Hazem Al-Ashmawy, "Modeling Wireless Discovery
and Deployment of Hybrid Multimedia N/W-Web Services Using Rapide ADL", Proceedings of the 7th
IEEE International Conference on High Speed N/Ws and Multimedia Communications (HSNMC04),
Toulouse, France, June 30- July 2nd, 2004
17-Mohamed A. Sameh, Rehab El-Kharboutly, "Modeling Jini-UpnP Using Rapide ADL", Proceedings of the
10th EUROMEDIA Conference (EUROMEDIA 2004), Hasselt, Belgium, April 19-21, 2004
18-Mohamed A. Sameh, "E-Access Custom Webber: A Multi-Protocol Stream Controller", Proceedings of
the IADIS International Conference on Applied Computing, Lisbon, Portugal, March 23-26, 2004
19- Ayman Kassem, A. Sameh, and Tony Keller, “Modeling and Simulation of Lumbar Spine Dynamics”,
Proceedings of the 15th IASTED International Conference on Modeling and Simulation and Optimization
(MSO 2004), Marina Del Rey, California, March 2004
20-Mohamed A. Sameh, and Shenouda S., "Tera-Scale High Performance Distributed and Parallel
Super-Computing at AUC", Proceedings of the 12th International Conference on Artificial Intelligence, Cairo,
Feb. 18-20, 2004
21-Shenouda S., Mohamed L., and Mohamed A. Sameh, "AUC Cluster Participation in Global Grid
Communities", Proceedings of the 12th International Conference on Artificial Intelligence, Cairo, Feb. 18-20, 2004
22-El-Ashmawi Hazem, and Mohamed A. Sameh, “XML-Socket Language-Independent Distributed Object
Computing Model”, Proceedings of the 15th International Conference on Parallel and Distributed
Computing Systems, Louisville, Kentucky, September, 2002
23-Mohamed Karasha, Greenshields Ian, and Mohamed A. Sameh, “HUSKY: A Multi-Agent Architecture
for Adaptive Scheduling of Grid Aware Applications”, Proceedings of the High Performance Computing
Symposium with the 2002 Advanced Simulation Technologies Conference (ASTC 2002), San Diego,
California, April 14-18, 2002
24-Atef Rania, Mohamed A. Sameh, and Abdel-Malek Karim, "Three Dimensional Deformable Modeling of
the Spinal Lumbar Region", Proceedings of the 11th International Conference on Intelligent Systems on
Emerging Technologies (ICIS-2002), Boston, July 18-20, 2002
25-Kassem Ayman, Mohamed A. Sameh, and Abdel-Malek Karim, "A Spring-Dashpot-String Element for
Modeling Spinal Column Dynamics", Proceedings of the International Workshop on Growth and Motion in
3D Medical Images, Copenhagen, Denmark, May 28- June 1, 2002
26-Kassem Ayman, and Mohamed A. Sameh, “A Fast Technique for Modeling and Control of Dynamic
System”, Proceedings of the 11th International Conference on Intelligent Systems on Emerging Technologies
(ICIS-2002), Boston, July 18-20, 2002
27-Mohamed A. Sameh, and Kaptan Noha, "Anytime Algorithms for Maximal Constraint Satisfaction",
Proceedings of the ISCA 14th International Conference on Computer Applications in Industry and
Engineering (CAINE 2001), Las Vegas, Nevada, Nov. 27-29, 2001
28-Mohamed A. Sameh, and Mansour Marwa, "Enhancing Partitionable Group Membership Service in
Asynchronous Distributed Systems", Proceedings of the ISCA 14th International Conference on Computer
Applications in Industry and Engineering (CAINE 2001), Las Vegas, Nevada, Nov. 27-29, 2001
29-Abdalla Mahmoud, Mohamed A. Sameh, Harras Khalid, Darwich Tarek, "Optimizing TCP in a Cluster of
Low-End Linux Machines", Proceedings of the 3rd WSEAS Symposium on Mathematical Methods and
Computational Techniques in Electrical Engineering, Athens, Greece, Dec. 29-31, 2001
30-Rania Abdel Hamid, and Mohamed A. Sameh, “Visual Constraint Programming Environment for
Configuration Problems”, Proceedings of the 15th International Conference on Computers and their
Applications, New Orleans, Louisiana, March 2000
31-Essam A. Lotfy, and Mohamed A. Sameh, “Applying Neural Networks in Case-Based Reasoning
Adaptation for Cost Assessment of Steel Buildings”, Proceedings of the 10th International Conference on
Computing and Information, ICCI-2000, Kuwait, Nov. 18-21, 2000
32-Ghada A. Nasr, and Mohamed A. Sameh, “Evolution of Recurrent Cascade Correlation Networks with a
Distributed Collaborative Species”, Proceedings of the IEEE Symposium on Combinations of Evolutionary
Computation and Neural Networks, San Antonio, TX, May 2000
33-El-Beltagy S., Rafea A., and Mohamed A. Sameh, “An Agent Based Approach to Expert System
Explanation”, Proceedings of the 12th International FLAIRS Conference, Orlando, Florida, 1999
34- Mohamed A. Sameh, Botros A. Kamal, "2D and 3D Fractal Rendering and Animation", Proceedings of
the Seventh Eurographics Workshop on Computer Animation and Simulation, Aug. 31st- Sept. 2nd, in
Poitiers, France, 1996
35-Mohamed A. Sameh, "A Robust Vision System for three Dimensional Facial Shape Acquisition,
Recognition, and Understanding", Proceedings of the 1st Golden West International Conference on
Intelligent Systems, Reno, Nevada, 1991
36-Mohamed A. Sameh, "A Neural Trees Architecture for Fast Control of Motion", Proceedings of the
FLAIRS Artificial Intelligence Conference, Cocoa Beach, Florida, 1991
37-Mohamed A. Sameh, Armstrong W.W., "Towards a Computational Theory for Motion Understanding:
The Expert Animator Model", Proceedings of the 4th International Conference on Artificial Intelligence for
Space Applications, NASA, Huntsville, Alabama, 1988
CV of Gregory Piatetsky-Shapiro:
Gregory Piatetsky-Shapiro, Ph.D. is the President of KDnuggets, which provides
research and consulting services in the areas of data mining, knowledge discovery,
bioinformatics, and business analytics. Previously, he led data mining and consulting
groups at GTE Laboratories, Knowledge Stream Partners, and Xchange. He has extensive
experience developing CRM, customer attrition, cross-sell, segmentation and other models
for some of the leading banks, insurance companies, and telcos. He also worked on clinical
trial, microarray, and proteomic data analysis for several leading biotech and
pharmaceutical companies.
Gregory served as an expert witness and provided expert opinions in several cases.
Gregory is also the Editor and Publisher of KDnuggets News, the leading newsletter on
data mining and knowledge discovery (published since 1993), and of the KDnuggets.com
website (published since 1997), the data mining community's top resource for data mining and
analytics software, jobs, solutions, courses, companies, publications, and more. From
1994 to 1997, while at GTE Laboratories, he published the Knowledge Discovery Nuggets
website, an earlier version of KDnuggets.
Gregory is the founder of the Knowledge Discovery in Databases (KDD) conferences. He
organized and chaired the first three Knowledge Discovery in Databases workshops in
1989, 1991, and 1993, and then chaired the KDD Steering Committee until 1998, when he
co-founded ACM SIGKDD, the leading professional organization for Knowledge
Discovery and Data Mining. He served as Director (1998 - 2005) and was elected
SIGKDD Chair (2005-2009 term).
Gregory has over 60 publications, including 2 best-selling books and several edited
collections on topics related to data mining and knowledge discovery, including SIGKDD
Explorations Special Issue on Microarray Data Mining (Vol 5, Issue 2, Dec 2003).
Gregory received the ACM SIGKDD Service Award (2000) and the IEEE ICDM Outstanding
Service Award (2007) for contributions to the data mining field and community.
Publication Record:
Data Mining and Knowledge Discovery - 1996 to 2005: Overcoming the Hype and moving
from "University" to "Business" and "Analytics", Gregory Piatetsky-Shapiro, Data Mining
and Knowledge Discovery journal, 2007.
What Are The Grand Challenges for Data Mining? KDD-2006 Panel Report, Gregory
Piatetsky-Shapiro, Robert Grossman, Chabane Djeraba, Ronen Feldman, Lise Getoor,
Mohammed Zaki, KDD-06 Panel Report, SIGKDD Explorations, 8(2), Dec 2006.
10 Challenging Problems in Data Mining Research, Qiang Yang, Xindong Wu, Pedro
Domingos, Charles Elkan, Johannes Gehrke, Jiawei Han, David Heckerman, Daniel
Keim, Jiming Liu, David Madigan, Gregory Piatetsky-Shapiro, Vijay V. Raghavan,
Rajeev Rastogi, Salvatore J. Stolfo, Alexander Tuzhilin, and Benjamin W. Wah, Spec.
Issue of International Journal of Information Technology & Decision Making, Vol. 5, No. 4
(2006).
On Feature Selection through Clustering, R. Butterworth, G. Piatetsky-Shapiro, Dan A.
Simovici, Proceedings of IEEE ICDM-2005 Conference, Nov 2005.
A Comprehensive Microarray Data Generator to Map the Space of Classification and
Clustering Methods, Piatetsky-Shapiro, Gregory , and Grinstein, Georges G., Tech.
Report No. 2004-016, U. Massachusetts Lowell, 2004.
Microarray Data Mining: Facing the Challenges, Gregory Piatetsky-Shapiro and
Pablo Tamayo, SIGKDD Explorations, Dec 2003.
Capturing Best Practice for Microarray Gene Expression Data Analysis, G. Piatetsky-Shapiro,
T. Khabaza, S. Ramaswamy, in Proceedings of KDD-2003 (ACM Conference
on Knowledge Discovery and Data Mining), Washington, D.C., 2003. (Honorable mention
for best application paper).
Measuring Real-Time Predictive Models (poster presentation), S. Steingold, R. Wherry, G.
Piatetsky-Shapiro, in Proceedings of IEEE ICDM-2001 Conference, San Jose, CA, Nov
2001.
Measuring Lift Quality in Database Marketing, G. Piatetsky-Shapiro and S.
Steingold, SIGKDD Explorations, Dec 2000.
Knowledge Discovery in Databases: 10 years after, Gregory Piatetsky-Shapiro, SIGKDD
Explorations, Vol 1, No 2, Feb 2000.
Expert Opinion: The data-mining industry coming of age, Gregory Piatetsky-Shapiro,
IEEE Intelligent Systems, Vol. 14, No. 6, November/December 1999.
Estimating Campaign Benefits and Modeling Lift, Gregory Piatetsky-Shapiro
and Brij Masand, Proceedings of KDD-99 Conference, ACM Press, 1999.
Knowledge Discovery and Acquisition from Imperfect Information, G. Piatetsky-Shapiro,
chapter in A. Motro and P. Smets, eds., Uncertainty in Information
Management, Kluwer, 1997.
From Data Mining to Knowledge Discovery in Databases,
Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth. AI Magazine 17(3):
Fall 1996, 37-54
Mining Business Databases, Ron Brachman, Tom Khabaza, Willi Kloesgen, Gregory
Piatetsky-Shapiro, and Evangelos Simoudis, Communications of ACM, 39:11,
November 1996.
Data Mining and Knowledge Discovery in Databases: An overview, Usama M. Fayyad,
Gregory Piatetsky-Shapiro, Padhraic Smyth, Communications of ACM, 39:11,
November 1996.
Improving Classification Accuracy by Automatic Generation of Derived Fields Using
Genetic Programming, B. Masand and G. Piatetsky-Shapiro, in Advances in Genetic
Programming II, MIT Press, 1996.
An Overview of Issues in Developing Industrial Data Mining and Knowledge Discovery
Applications, Gregory Piatetsky-Shapiro, Ron Brachman, Tom Khabaza, Willi
Kloesgen, and Evangelos Simoudis, in KDD-96 Conference Proceedings, ed. E.
Simoudis, J. Han, and U. Fayyad, AAAI Press, 1996.
A Comparison of Approaches For Maximizing Business Payoff of Prediction Models,
Brij Masand and Gregory Piatetsky-Shapiro, in KDD-96 Conference Proceedings, ed. E.
Simoudis, J. Han, and U. Fayyad, AAAI Press, 1996.
Knowledge Discovery and Data Mining: Towards a Unifying Framework, Usama
Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth in KDD-96 Conference
Proceedings, ed. E. Simoudis, J. Han, and U. Fayyad, AAAI Press, 1996.
From Data Mining to Knowledge Discovery: an Overview, U. Fayyad, G. Piatetsky-Shapiro,
P. Smyth, in Advances in Knowledge Discovery and Data Mining, AAAI/MIT
Press, 1996.
Selecting and Reporting What is Interesting: The KEFIR Application to Healthcare
Data, C. Matheus, G. Piatetsky-Shapiro, and D. McNeill, in Advances in Knowledge
Discovery and Data Mining, AAAI/MIT Press, 1996.
Knowledge Discovery in Personal Data vs. Privacy, G. Piatetsky-Shapiro, IEEE Expert,
April 1995
KDD-93: Progress and Challenges in Knowledge Discovery in Databases,
G. Piatetsky-Shapiro, C. Matheus, P. Smyth, R. Uthurusamy, AI Magazine, 15(3): Fall
1994, 77-82.
The Interestingness of Deviations, G. Piatetsky-Shapiro, C. Matheus, in Proceedings
of KDD-94 workshop, AAAI Press, 1994.
Systems for Knowledge Discovery in Databases, C. Matheus, P. Chan, G. Piatetsky-Shapiro,
IEEE Transactions on Knowledge and Data Engineering, 5(6), Dec. 1993.
Measuring Data Dependencies, G. Piatetsky-Shapiro and C. Matheus, in Proceedings
of AAAI-93 Workshop on KDD, AAAI Press Report WS-02.
"Knowledge Discovery in Databases - An Overview", W. Frawley, G. Piatetsky-Shapiro,
C. Matheus, in Knowledge Discovery in Databases 1991, pp. 1-30.
Reprinted in AI Magazine, Fall 1992.
Knowledge Discovery Workbench for Exploring Business Databases, G. Piatetsky-Shapiro
and C. Matheus, in Int. J. of Intelligent Systems, 7(7), Sep 1992.
Report on AAAI91 workshop on Knowledge Discovery in Databases, G. Piatetsky-Shapiro,
IEEE Expert, Fall 1991
"Discovery, Analysis, and Presentation of Strong Rules", G. Piatetsky-Shapiro (in
Knowledge Discovery in Databases 1991), pp. 229-248.
Knowledge Discovery in Real Databases: A workshop report, AI Magazine,
vol. 11, no. 5, January 1991.
Books and Proceedings
ACM TKDD Special Issue on Knowledge Discovery for Web Intelligence, Guest
Editors: Ning Zhong, Gregory Piatetsky-Shapiro, Yiyu Yao, Philip S. Yu, Dec 2010.
SIGKDD Explorations Special Issue on Microarray Data Mining, Vol. 5, Issue 2, Dec 2003.
R. Agrawal, P. Stolorz, and G. Piatetsky-Shapiro, eds., Proceedings of KDD-98: 4th
International Conf. on Knowledge Discovery and Data Mining, AAAI Press, 1998.
U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, R. Uthurusamy, eds., Advances in
Knowledge Discovery and Data Mining, AAAI/MIT Press, 1996.
Mini-symposium on KDD vs. Privacy, IEEE Expert, April 1995.
Special issue of J. of Intelligent Information Systems on Knowledge Discovery in
Databases, ed. G. Piatetsky-Shapiro, 4(1), Jan 1995.
KDD-93: Proceedings of AAAI-93 Workshop on KDD, ed. G. Piatetsky-Shapiro, AAAI
Press Report WS-02, 1993.
Gregory Piatetsky-Shapiro and William Frawley, eds., Knowledge Discovery in
Databases, AAAI/MIT Press, 1991
Appendix B: Evaluations and Approvals
COLLEGE REVIEW COMMITTEE Evaluation and Recommendation
Item / Evaluation                        Excellent   Very Good   Good   Weak
Research methodology
Research objectives
Research originality
Research contribution
Research applicability and relevance
Overall evaluation
Recommendations of College Committee
Approved     Disapproved
Amount of Budget Approved by College Committee: (SAR)
Chair College Committee - Title and Full Name:
Signature:
Date:    /    /
Recommendations of the College Council
Approved     Disapproved
Dean of the College Council - Title and Full Name:
Signature:
Date:    /    /
PSU INSTITUTIONAL RESEARCH COMMITTEE (IRC) Recommendation
Recommendation of the PSU IRC
Approved     Disapproved
Chair IRC Committee - Title and Full Name:
Signature:
Date:    /    /
PSU EXTERNAL REVIEW PANEL FOR RESEARCH PROPOSALS Recommendation
Recommendation of the External Review Committee
Approved: Amount of grant approved: (SAR)
Disapproved:
Postponed:
Directed to:
Chair of External Review Panel - Title and Full Name:
Signature:
Date:    /    /
Recommendation of University Council
Approved     Disapproved
Signature:
Date:    /    /