Arushi Raghuvanshi
IMAGE PROCESSING AND MACHINE LEARNING FOR THE DIAGNOSIS OF MELANOMA CANCER
In the United States in 2010 there were only about 70,000 new cases of melanoma cancer but 10,000 deaths, making it one of the deadliest types of cancer.
Melanoma cancer can be cured.
Then why so deadly?
The biggest problem with melanoma is that it is not diagnosed early enough.
I created a computer program for at home use to help with the early diagnosis of melanoma cancer.
Abstract
Melanoma cancer is one of the most dangerous and potentially deadly types of skin cancer; however, if diagnosed early, it is nearly one-hundred percent curable [UnderstMel09]. Here I propose an efficient system that helps with the early diagnosis of melanoma cancer. Different image processing techniques and machine learning algorithms are evaluated to distinguish between cancerous and non-cancerous moles. Two image feature databases were created: one compiled from a dermatologist-training tool for melanoma from Hosei University and the other created by extracting features from digital pictures of lesions using software called Skinseg. I then applied various machine learning techniques to the image feature databases using a Python-based tool called Orange. The experiments suggest that, among the methods tested, the combination of Bayes machine learning with Hosei image feature extraction is the best method for detecting cancerous moles. Using this method, a computer tool was developed to return the probability that a mole in an image is cancerous. This is a very practical application, as it lets people estimate at home the probability that a mole is cancerous. It does not replace visits to a doctor, but it provides early information that allows people to be proactive in the diagnosis of melanoma cancer.
Current Methods of Diagnosis
Currently, the initial diagnosis of melanoma is based on manual inspection by a trained dermatologist using what is known as the ABCDE method. If this manual inspection indicates a high probability that a mole is cancerous, the dermatologist performs an excisional biopsy and examines the tissue under a microscope to determine whether the mole is malignant or benign. If the mole is malignant, the surgeon performs a sentinel lymph node (SLN) biopsy. If melanoma is still present in the body after this step, the surgeon may remove nearby lymph nodes to keep the cancer from spreading farther.
The ABCDE Method
Source: The Ear, Nose, and Throat Alliance: http://www.allianceent.net/index.php?section=3&pid=198
The ABCDE method is the current method of manual inspection used by dermatologists to distinguish between moles that are cancerous and benign. ABCDE is an acronym for the following: Asymmetry, Border Irregularity, Color, Diameter, Evolving. When a mole is asymmetrical or its borders are uneven, it is more likely to be malignant. Similarly, if it has two or more colors, a large diameter, or evolves over time, there is a high probability that the mole is cancerous. The diagnosis of melanoma is not based on just one of these factors but on a combination of all of them. My project uses these image attributes, as well as others, to automate the manual inspection process.
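As a rough illustration of how one ABCDE attribute might be turned into a number, the sketch below scores left/right asymmetry for a binary lesion mask. It mirrors the mask about the vertical midline of its bounding box and reports the fraction of lesion pixels that have no mirrored counterpart. The function name `asymmetry_score` and the mask representation (a list of rows of 0/1 values) are assumptions made for this example; this is not how Skinseg or the Hosei tool computes asymmetry.

```python
def asymmetry_score(mask):
    """Return a value in [0, 1]; 0 means the lesion is mirror-symmetric left/right."""
    # Coordinates of every lesion pixel (non-zero entries of the mask).
    pixels = {(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v}
    if not pixels:
        raise ValueError("mask contains no lesion pixels")
    cols = [c for _, c in pixels]
    # Reflecting column c about the bounding-box midline gives (min + max) - c.
    span = min(cols) + max(cols)
    unmatched = sum(1 for (r, c) in pixels if (r, span - c) not in pixels)
    return unmatched / len(pixels)


# Hypothetical toy masks: 1 marks a lesion pixel, 0 marks background.
symmetric_mask = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
lopsided_mask = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 0],
]

symmetric_score = asymmetry_score(symmetric_mask)
lopsided_score = asymmetry_score(lopsided_mask)
print(symmetric_score, lopsided_score)
```

A real tool would also need to test other axes and handle rotation; this sketch checks a single vertical axis only.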
My Steps for Detecting Melanoma
● Image acquisition: I first acquired images of moles that may be cancerous. For this project, I used images taken with a standard household digital camera in order to make the tool available for at-home use. I obtained a set of images from online sources and from a dermatologist.
● Feature extraction: Feature extraction means deciding which features are important and extracting those features from an image. For this project, the important features corresponded to the ABCDE method for melanoma detection.
● Machine learning: Machine learning algorithms can be used to find trends in the relationships between the features extracted in the previous step. These trends can then be used to predict whether the mole in a new image is cancerous.
Procedure (1)
Step 1. Acquire Images
Acquire images of both cancerous and non-cancerous moles from dermatologists and from the internet.
Step 2. Create feature database using different image feature extraction software tools
i) Skinseg tool - Skinseg segments a given image to isolate the portion of interest (i.e. the mole) and extracts a set of features from this segment.
After compiling a set of images, open each image individually and segment it. If automatic segmentation doesn’t work, do semi-automatic or manual segmentation. Once the image is segmented, view the features and save them to a text file.
Procedure (2)
Once all of the feature files are saved, use a Python script to create a database of the following features:
■ Asymmetry
■ Boundary irregularity
■ Average RGB intensity
■ Dominant RGB intensity
■ Average HSI intensity
■ Dominant HSI intensity
■ Entropy
■ Energy
■ Inertia
■ Homogeneity
This creates a database of the features in a TAB-delimited file.
ii) Hosei tool: The Hosei tool has predetermined features for given images. Most likely, these features were determined by physician inspection.
Using this dermatologist-training website, compile a set of the following features:
■ Symmetry
■ Borders
■ Color
■ Pigment Network
■ Branched Streaks
■ Homogeneous
■ Dots
■ Globules
■ Atypical Pigment
■ Blue-Whitish Veil
■ Atypical Vascular Pattern
■ Irregular Streaks
■ Irregular Pigmentation
■ Regression Structures
This creates a database of the features in a TAB-delimited file.
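The Skinseg branch of Step 2, where a Python script merges the saved feature text files into one TAB-delimited database, can be sketched as follows. The layout of the feature files is not shown in this document, so the sketch assumes one `name: value` pair per line. The function names `parse_feature_file` and `build_tab_database`, the file names, and the sample values are all hypothetical; this is an illustration, not the original script.

```python
import csv
import os
import tempfile


def parse_feature_file(path):
    """Read 'name: value' lines into a dict (assumed file layout)."""
    features = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if ":" not in line:
                continue  # skip blank or malformed lines
            name, value = line.split(":", 1)
            features[name.strip()] = value.strip()
    return features


def build_tab_database(feature_paths, out_path):
    """Write one row per feature file to a TAB-delimited file; return the header."""
    rows = [parse_feature_file(path) for path in feature_paths]
    # Header is the union of feature names, in first-seen order.
    header = []
    for row in rows:
        for name in row:
            if name not in header:
                header.append(name)
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=header, delimiter="\t",
                                restval="?", lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)  # features missing from a file are written as "?"
    return header


# Demo with two made-up feature files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "mole1.txt": "asymmetry: 0.42\nentropy: 5.1\nmelanoma: yes\n",
        "mole2.txt": "asymmetry: 0.08\nentropy: 3.7\nmelanoma: no\n",
    }
    paths = []
    for name, text in samples.items():
        path = os.path.join(tmp, name)
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        paths.append(path)
    out_path = os.path.join(tmp, "demo.tab")
    demo_header = build_tab_database(paths, out_path)
    with open(out_path, encoding="utf-8") as handle:
        demo_lines = handle.read().splitlines()

print(demo_lines)
```

Each input file becomes one row, and the classification (melanoma yes/no) is carried along as an ordinary column.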
Procedure (3)
Step 3. Machine Learning
The databases created in step 2 are now used for machine learning. There are various methods for machine learning. I worked with the following:
● Majority Learning
● Bayes Learning
● Decision Trees
● kNN (Nearest Neighbor)
Evaluate the above methods using a Python-based software tool called Orange. I wrote code in this program to measure the percent accuracy on different sets of data for a given machine learning method and feature extraction method.
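The accuracy testing in Step 3 can be sketched in plain Python without Orange. The document does not say how accuracy was measured, so k-fold cross-validation is an assumption here, as are the names `majority_learner` and `cross_val_accuracy` and the toy data. The majority learner serves as the baseline that any other method should beat.

```python
from collections import Counter


def majority_learner(rows, labels):
    """Ignore the features; always predict the most common training label."""
    most_common = Counter(labels).most_common(1)[0][0]
    return lambda row: most_common


def cross_val_accuracy(learner, rows, labels, folds=5):
    """Percent of held-out examples classified correctly across all folds."""
    correct = 0
    for fold in range(folds):
        # Every folds-th example, starting at index `fold`, is held out.
        test_idx = set(range(fold, len(rows), folds))
        train_rows = [r for i, r in enumerate(rows) if i not in test_idx]
        train_labels = [y for i, y in enumerate(labels) if i not in test_idx]
        predict = learner(train_rows, train_labels)
        correct += sum(predict(rows[i]) == labels[i] for i in test_idx)
    return 100.0 * correct / len(rows)


# Made-up data: 7 benign and 3 malignant examples with one dummy feature each.
toy_rows = [[i] for i in range(10)]
toy_labels = ["no"] * 7 + ["yes"] * 3

baseline_accuracy = cross_val_accuracy(majority_learner, toy_rows, toy_labels, folds=5)
print(baseline_accuracy)
```

Any learner that takes training rows and labels and returns a prediction function can be passed in place of `majority_learner`, which makes it easy to compare methods on the same folds.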
Image Acquisition
● Taken by normal digital camera
● Acquired images from multiple sources
○ Dr. Kristin Stevens, MD, Dermatologist, Providence Medical Group, Portland, OR
○ Dr. Sandhya Koppula, MD, Dermatologist, Cornell Dermatology Clinic, Beaverton, OR
○ Internet
Feature Extraction Tools
● Skinseg: Skinseg is a program developed by Wright State University. This tool segments a given image and then extracts a set of features from the segment.
● Hosei Tool: Created by Hosei University in Japan, this tool provides predetermined features for given images. Most likely, these features were determined by manual inspection by dermatologists.
● CVIP tools: This tool, developed by Southern Illinois University at Edwardsville, is very powerful but mostly interactive, so I did not use it for my project. It is possible to create a computer program which does this in a more automatic way, but I chose to use Skinseg and the Hosei tool instead. In the future, I plan to experiment more with this tool.
● Mole Expert Micro: This is commercial software for the feature extraction of melanoma images. I was able to receive an evaluation version of this software for free from the founder of the company. Unfortunately, this software required a known pixels-per-millimeter value, which was not available for my images.
● OpenCV: This tool from Intel would be very powerful for completely automating the process of feature extraction; however, it is not specifically designed for melanoma images. In the future, I plan to use OpenCV or get the source code for Skinseg in order to completely automate the feature extraction process for a web-based melanoma diagnostic system.
Machine Learning
Machine Learning Tools
● I used Orange, a Python-based tool, which has libraries of multiple machine learning algorithms.
● I wrote programs in Orange to test a variety of different machine learning algorithms (listed below).
● MVSIS is another machine learning tool which I explored but didn’t test.
Machine Learning Algorithms
● Majority Learning
○ Majority learning is a basic technique which gives a probability of a given mole being cancerous based on the distribution of cancerous and non-cancerous entries in the database.
● Bayes Learning
○ In Bayes learning, Bayesian networks are created which represent the relationship between a given feature and the probability that the mole is cancerous. Combined, these networks can give a probability for whether or not the mole is cancerous.
● Decision Trees
○ This machine learning method creates a tree based on the training data. There are a variety of techniques for creating the best tree and for distinguishing which features are important and which are not.
● kNN (k-Nearest Neighbor)
○ Nearest Neighbor is a machine learning method which creates an n-dimensional space corresponding to n features. It then calculates the probability that a mole is cancerous based on its k nearest neighbors, where k is a selected value.
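Two of the algorithms above can be sketched in a few lines of plain Python. The Bayes example uses naive Bayes with add-one smoothing, a simple Bayesian classifier that treats features as independent given the class; whether this matches the exact Orange configuration used in the project is an assumption. The kNN example returns the fraction of the k nearest training examples that are labeled cancerous. All function names and data values are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(rows, labels):
    """Fit categorical naive Bayes with add-one (Laplace) smoothing."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)

    def predict_proba(row):
        log_scores = {}
        for label, n_label in class_counts.items():
            log_p = math.log(n_label / len(labels))  # class prior
            for j, value in enumerate(row):
                count = value_counts[(j, label)][value]
                log_p += math.log((count + 1) / (n_label + len(values_seen[j])))
            log_scores[label] = log_p
        # Convert log scores to probabilities that sum to 1.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    return predict_proba


def knn_proba(rows, labels, query, k=3, positive="yes"):
    """Fraction of the k nearest training rows (Euclidean distance) labeled positive."""
    ranked = sorted(zip(rows, labels), key=lambda pair: math.dist(pair[0], query))
    nearest = ranked[:k]
    return sum(1 for _, label in nearest if label == positive) / len(nearest)


# Hypothetical categorical features: (asymmetric?, irregular border?)
cat_rows = [("yes", "yes"), ("yes", "no"), ("no", "no"),
            ("no", "no"), ("yes", "yes"), ("no", "yes")]
cat_labels = ["yes", "yes", "no", "no", "yes", "no"]
bayes = train_naive_bayes(cat_rows, cat_labels)
bayes_result = bayes(("yes", "yes"))

# Hypothetical numeric features: (asymmetry score, border irregularity score)
num_rows = [(0.1, 0.2), (0.2, 0.1), (0.8, 0.9), (0.9, 0.8), (0.7, 0.7)]
num_labels = ["no", "no", "yes", "yes", "yes"]
knn_result = knn_proba(num_rows, num_labels, query=(0.5, 0.45), k=3)

print(bayes_result, knn_result)
```

Both functions return a probability rather than a hard yes/no, which matches the goal of reporting how likely it is that a mole is cancerous.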
[Figure: Sample Feature Database]
Creating Feature Database
[Flowchart] Original image → Segmented image → Extracted features → Save to text file with feature list and classification (melanoma: yes/no) → More images to process?
● Yes: process the next image.
● No: run the Python script (convert.py), providing all the feature text files as input and creating a TAB file (skinsegdb.tab). Each input file is represented by a row in the TAB file.
Analyzing a User-Submitted Image
[Flowchart] User-submitted image → Segmented image → Extracted features → Save to text file with feature list → Run Python script in Orange, which uses skinsegdb.tab and applies various machine learning methods (decision trees, majority learning, Bayes learning, nearest neighbor) to provide a probability for whether or not the user-submitted image has a cancerous mole.
Proposed Web-based Melanoma Diagnostic System
[Diagram] Web page: submit .jpg image of mole → Feature extraction → Features → Machine learning → Diagnosis → Display result
Results
[Charts not reproduced; axis label: Machine Learning Algorithm]
Analysis & Conclusion
● I concluded that, using image processing and machine learning tools, I can create an effective algorithm to assist with the early diagnosis of melanoma cancer. This algorithm can be implemented in a website which can be used by people at home.
● The best learning method on my data set was Bayes learning combined with the Hosei tool for feature extraction. This was different from my original hypothesis.
● The accuracy of classification was lower than I predicted, but I only had about 130 images in my database. I did some experimentation and found that accuracy increases with the size of the database, so I expect accuracy to improve with more images.
● I eliminated some variables from the machine learning database to help keep my results consistent. I eliminated number of pixels, perimeter, and area, because each image used a different scale. I also eliminated the file name because it had no effect on whether or not the image was cancerous.
● Next steps include automating the process, using a larger database of images, using a parallel computing architecture, such as CUDA, for faster computation, and creating an iPhone application.
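The variable elimination described above amounts to dropping columns before training. A minimal sketch, assuming each image is stored as a dict of feature names to values; the feature names, the helper `drop_features`, and the sample values are hypothetical:

```python
def drop_features(records, excluded):
    """Return copies of the records without the excluded feature names."""
    excluded = set(excluded)
    return [{name: value for name, value in record.items() if name not in excluded}
            for record in records]


# Scale-dependent or uninformative columns (names are assumed for this example).
EXCLUDED = ("filename", "pixels", "perimeter", "area")

records = [
    {"filename": "img001.jpg", "pixels": 5120, "perimeter": 310, "area": 5120,
     "asymmetry": 0.42, "melanoma": "yes"},
    {"filename": "img002.jpg", "pixels": 880, "perimeter": 120, "area": 880,
     "asymmetry": 0.08, "melanoma": "no"},
]

cleaned = drop_features(records, EXCLUDED)
print(cleaned)
```

The original records are left untouched, so the full feature set remains available if the images are later rescaled to a common resolution.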
Acknowledgements
● Dr. A. Goshtasby, Wright State University, on the Skinseg image processing tool
● Dr. Scott E Umbaughs, Southern Illinois University, Edwardsville, on CVIP tools
● Dr. Alan Mishchenko, University of California Berkeley, for MVSIS
● Mr. Holger Lüdtke, founder of MoleExpert, for providing access to an evaluation version of the MoleExpert Micro tool
● Ms. Iris Cheng, University of California Berkeley, for sharing her research on image processing
● Dr. B.J. Shrestha, Missouri University of Science and Technology
● Dr. Kristin Stevens, MD, Dermatologist, Providence Medical Group, Portland. Field expert for current methods of diagnosis; also provided images of melanoma
● Dr. Sandhya Koppula, MD, Dermatologist, Cornell Dermatology Clinic, Beaverton, for providing medical field expertise