Arushi Melanoma

Arushi Raghuvanshi

IMAGE PROCESSING AND MACHINE LEARNING FOR THE DIAGNOSIS OF MELANOMA CANCER

In the United States in 2010 there were only about 70,000 new cases of melanoma cancer but 10,000 deaths, making it one of the deadliest types of cancer.

Melanoma cancer can be cured.

Then why so deadly?

The biggest problem with melanoma is that it is not diagnosed early enough.

I created a computer program for at-home use to help with the early diagnosis of melanoma cancer.

Abstract

Melanoma cancer is one of the most dangerous and potentially deadly types of skin cancer; however, if diagnosed early, it is nearly one-hundred percent curable [UnderstMel09]. Here I propose an efficient system which helps with the early diagnosis of melanoma cancer. Different image processing techniques and machine learning algorithms are evaluated to distinguish between cancerous and non-cancerous moles. Two image feature databases were created: one compiled from a dermatologist-training tool for melanoma from Hosei University and the other created by extracting features from digital pictures of lesions using a software tool called Skinseg. I then applied various machine learning techniques to the image feature databases using a Python-based tool called Orange. The experiments suggest that, among the methods tested, the combination of Bayes machine learning with Hosei image feature extraction is the best method for detecting cancerous moles. Using this method, a computer tool was developed to return the probability that the mole in an image is cancerous. This is a practical application, as it allows people to estimate at home the probability that a mole is cancerous. This does not replace visits to a doctor, but provides early information that allows people to be proactive in the diagnosis of melanoma cancer.

Current Methods of Diagnosis

Currently, the initial diagnosis of melanoma is based on manual inspection by a trained dermatologist using what is known as the ABCDE method. If this manual inspection indicates a high probability that a mole is cancerous, the dermatologist performs an excisional biopsy and examines the tissue at the microscopic level to determine whether the mole is malignant or benign. If the mole is malignant, the surgeon performs a Sentinel Lymph Node (SLN) biopsy. If melanoma is still present in the body after this step, the surgeon may remove nearby lymph nodes to keep the cancer from spreading further.

The ABCDE Method

Source: The Ear, Nose, and Throat Alliance: http://www.allianceent.net/index.php?section=3&pid=198

The ABCDE method is the current method of manual inspection used by dermatologists to distinguish between cancerous and benign moles. ABCDE is an acronym for the following: Asymmetry, Border irregularity, Color, Diameter, Evolving. When a mole is asymmetrical or its borders are uneven, it is more likely to be malignant. Similarly, if it has two or more colors, a large diameter, or evolves over time, there is a high probability that the mole is cancerous. The diagnosis of melanoma is not based on just one of these factors but on a combination of all of them. My project uses these image attributes as well as others to automate the manual inspection process.
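To illustrate how one ABCDE attribute might be turned into a number, here is a minimal sketch of an asymmetry score computed from a binary lesion mask. It is not how Skinseg or the Hosei tool measure asymmetry; the function name `asymmetry_score` and the mirror-and-compare definition are assumptions made for illustration only.

```python
def asymmetry_score(mask):
    """Fraction of lesion pixels with no mirror partner across the vertical
    line through the center of the lesion's bounding box.

    mask: list of rows, each a list of 0/1 values (1 = lesion pixel).
    Returns 0.0 for a left-right symmetric lesion; larger means more asymmetric.
    Illustrative definition only (an assumption, not a clinical measure).
    """
    pixels = {(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value}
    if not pixels:
        raise ValueError("mask contains no lesion pixels")
    columns = [c for _, c in pixels]
    lo, hi = min(columns), max(columns)
    # Mirroring column c across the bounding-box center maps it to lo + hi - c.
    unmatched = sum(1 for (r, c) in pixels if (r, lo + hi - c) not in pixels)
    return unmatched / len(pixels)


symmetric = [[0, 1, 1, 0],
             [1, 1, 1, 1],
             [0, 1, 1, 0]]
lopsided = [[1, 1, 0, 0],
            [1, 1, 1, 1],
            [1, 0, 0, 0]]

print(asymmetry_score(symmetric))  # 0.0
print(asymmetry_score(lopsided))   # 3 of 7 pixels are unmatched
```

A real tool would also check symmetry about other axes and would work on the segmented mole rather than a hand-written mask.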

My Steps for Detecting Melanoma

● Image acquisition: I first need to acquire images of moles that may be cancerous. For this project, I used images taken with a standard household digital camera in order to make the tool suitable for at-home use. I obtained images online and from dermatologists.

● Feature extraction: Feature extraction means deciding which features are important and extracting those features from an image. For this project, the important features corresponded to the ABCDE method for melanoma detection.

● Machine learning: Machine learning algorithms can be used to find trends in the relationships between the features extracted in the previous step and the diagnosis. These trends can be used to predict whether the mole in a new image is cancerous or not.

Procedure (1)

Step 1. Acquire Images

Acquire images of both cancerous and non-cancerous moles from dermatologists and from the internet.

Step 2. Create a feature database using different image feature extraction software tools

i) Skinseg tool: Skinseg segments a given image to isolate the portion of interest (i.e., the mole) and extracts a set of features from this segment.

After compiling a set of images, open each image individually and segment it. If automatic segmentation doesn't work, use semi-automatic or manual segmentation. Once the image is segmented, view the features and save them to a text file.

Procedure (2)

Once all of the feature files are saved, use a Python script to create a database of the following features:

■ Asymmetry
■ Boundary irregularity
■ Average RGB intensity
■ Dominant RGB intensity
■ Average HSI intensity
■ Dominant HSI intensity
■ Entropy
■ Energy
■ Inertia
■ Homogeneity

This creates a database of the features in a TAB-delimited file.

ii) Hosei tool: The Hosei tool has predetermined features for given images. Most likely, these features were determined by physician inspection.

Using this dermatologist-training website, compile a set of the following features:

■ Symmetry
■ Borders
■ Color
■ Pigment Network
■ Branched Streaks
■ Homogenous
■ Dots
■ Globules
■ Atypical Pigment
■ Blue-Whitish Veil
■ Atypical Vascular Pattern
■ Irregular Streaks
■ Irregular Pigmentation
■ Regression Structures

This creates a database of the features in a TAB-delimited file.
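The conversion of saved feature files into a TAB-delimited database could look roughly like the sketch below. The original script is not shown in these slides, so everything here is an assumption: the `name: value` layout of a feature file, the function names `parse_feature_text` and `build_tab`, the column names, and the use of `?` for a missing value. The extra header rows that Orange may expect in a .tab file are not reproduced.

```python
import csv
import io


def parse_feature_text(text):
    """Parse 'name: value' lines into a dict (assumed feature-file layout)."""
    features = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, value = line.split(":", 1)
        features[name.strip()] = value.strip()
    return features


def build_tab(feature_texts, columns):
    """Return TAB-delimited text: one header row, then one row per input file."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for text in feature_texts:
        features = parse_feature_text(text)
        # A feature absent from a file is written as "?" (placeholder choice).
        writer.writerow([features.get(column, "?") for column in columns])
    return out.getvalue()


# Two hypothetical feature files, given as strings to keep the sketch self-contained.
files = [
    "asymmetry: 0.12\nborder: 0.40\nmelanoma: yes",
    "asymmetry: 0.03\nmelanoma: no",
]
print(build_tab(files, ["asymmetry", "border", "melanoma"]))
```

In practice the strings in `files` would be read from the text files saved during segmentation, and the result would be written to a file such as skinsegdb.tab.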

Procedure (3)

Step 3. Machine Learning

The databases created in step 2 are now used for machine learning. There are various methods for machine learning. I worked with the following:

■ Majority Learning
■ Bayes Learning
■ Decision Trees
■ kNN (Nearest Neighbor)

Evaluate the above methods using a Python-based software tool called Orange. I wrote code in this program to test the percent accuracy on different sets of data for a given machine learning method and feature extraction method.
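The slides do not say how percent accuracy was estimated. One common approach is k-fold cross-validation, sketched below in plain Python rather than with Orange's own functions. The names `cross_val_accuracy` and `train_majority`, the fold count, and the toy data are all assumptions for illustration.

```python
import random
from collections import Counter


def cross_val_accuracy(rows, labels, train_fn, k=5, seed=0):
    """Percent of examples classified correctly when each is held out once.

    train_fn(train_rows, train_labels) must return a predict(row) function.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal folds
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in indices if i not in held_out]
        predict = train_fn([rows[i] for i in train_idx],
                           [labels[i] for i in train_idx])
        correct += sum(predict(rows[i]) == labels[i] for i in fold)
    return 100.0 * correct / len(rows)


def train_majority(train_rows, train_labels):
    """Baseline learner: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda row: most_common


# Toy data: 8 benign and 2 malignant examples with one dummy feature each.
rows = [[i] for i in range(10)]
labels = ["no"] * 8 + ["yes"] * 2
print(cross_val_accuracy(rows, labels, train_majority))  # 80.0
```

Any learner that fits the `train_fn` interface can be swapped in, which makes it easy to compare methods on the same folds.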

Image Acquisition

● Taken by normal digital camera

● Acquired images from multiple sources

○ Dr. Kristin Stevens, MD, Dermatologist, Providence Medical Group, Portland, OR

○ Dr. Sandhya Koppula, MD, Dermatologist, Cornell Dermatology Clinic, Beaverton, OR

○ Internet

Feature Extraction Tools

● Skinseg: Skinseg is a program developed by Wright State University. This tool segments a given image and then extracts a set of features from the segment.

● Hosei Tool: Created by Hosei University in Japan, this tool provides predetermined features for given images. Most likely, these features were determined by manual inspection by dermatologists.

● CVIP tools: This tool, developed by Southern Illinois University at Edwardsville, is very powerful but mostly interactive, so I did not use it for my project. It is possible to create a computer program which does this in a more automatic way, but I chose to use Skinseg and the Hosei tool instead. In the future, I plan to experiment more with this tool.

● MoleExpert Micro: This is commercial software for extracting features from melanoma images. I was able to receive an evaluation version of this software for free from the founder of the company. Unfortunately, this software requires a known pixels-per-millimeter scale, which was not available for my images.

● OpenCV: This tool from Intel would be very powerful for completely automating the process of feature extraction; however, it is not specifically designed for melanoma images. In the future, I plan to use OpenCV or obtain the source code for Skinseg in order to completely automate the feature extraction process for a web-based melanoma diagnostic system.

Machine Learning

Machine Learning Tools

● I used Orange, a Python-based tool, which has libraries of multiple machine learning algorithms.

● I wrote programs in Orange to test a variety of different machine learning algorithms (listed below).

● MVSIS is another machine learning tool which I explored but didn't test.

Machine Learning Algorithms

● Majority Learning

○ Majority learning is a basic technique which gives the probability of a given mole being cancerous based only on the distribution of cancerous and non-cancerous entries in the database.

● Bayes Learning

○ In Bayes learning, Bayesian networks are created which represent the relationship between a given feature and the probability that the mole is cancerous. Combined, these networks give a probability for whether or not the mole is cancerous.

● Decision Trees

○ This machine learning method creates a tree based on the training data. There are a variety of techniques for creating the best tree and for determining which features are important and which are not.

● kNN (k-Nearest Neighbor)

○ Nearest Neighbor is a machine learning method which creates an n-dimensional space corresponding to n features. It then calculates the probability of a mole being cancerous based on its k nearest neighbors, where k is a selected value.
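Three of the methods above are simple enough to sketch in a few lines of plain Python. These are not Orange's implementations. The Bayes sketch assumes the simpler naive Bayes model (features treated as independent given the class) with add-one smoothing, and the kNN sketch assumes Euclidean distance; the function names and the toy data are hypothetical. Decision trees are left out to keep the example short.

```python
import math


def majority_probability(labels, target="yes"):
    """Majority learning: class frequency in the database, ignoring features."""
    return labels.count(target) / len(labels)


def naive_bayes_probability(rows, labels, query, target="yes"):
    """Naive Bayes over categorical features with add-one smoothing."""
    scores = {}
    for cls in set(labels):
        cls_rows = [r for r, y in zip(rows, labels) if y == cls]
        log_p = math.log(len(cls_rows) / len(rows))  # log prior
        for j, value in enumerate(query):
            n_values = len({r[j] for r in rows})  # distinct values of feature j
            matches = sum(1 for r in cls_rows if r[j] == value)
            log_p += math.log((matches + 1) / (len(cls_rows) + n_values))
        scores[cls] = log_p
    # Normalize the per-class scores so they sum to 1.
    top = max(scores.values())
    total = sum(math.exp(s - top) for s in scores.values())
    return math.exp(scores.get(target, -math.inf) - top) / total


def knn_probability(rows, labels, query, k=3, target="yes"):
    """kNN: share of the k closest rows (Euclidean distance) labeled target."""
    ranked = sorted(zip(rows, labels), key=lambda pair: math.dist(pair[0], query))
    nearest = [label for _, label in ranked[:k]]
    return nearest.count(target) / len(nearest)


# Hypothetical categorical database (Hosei-style yes/no or graded features).
cat_rows = [("asym", "irregular"), ("asym", "regular"),
            ("sym", "regular"), ("sym", "regular")]
cat_labels = ["yes", "yes", "no", "no"]
print(majority_probability(cat_labels))
print(naive_bayes_probability(cat_rows, cat_labels, ("asym", "irregular")))

# Hypothetical numeric database (Skinseg-style measurements).
num_rows = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
num_labels = ["no", "no", "yes", "yes"]
print(knn_probability(num_rows, num_labels, (0.8, 0.9), k=3))
```

Each function returns a probability that the mole is cancerous, which matches the kind of output the tool described in these slides reports to the user.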

Sample Feature Database

[Figure: sample feature database]

Creating Feature Database

[Flowchart] Original image → segmented image → extracted features → save to text file with feature list and classification (melanoma: yes/no). If there are more images to process, process the next image. If not, run the Python script (convert.py), providing all the feature text files as input and creating a TAB file (skinsegdb.tab). Each input file is represented by a row in the TAB file.

Analyzing a User-Submitted Image

[Flowchart] User-submitted image → segmented image → extracted features → save to text file with feature list → run Python script in Orange, which applies various machine learning methods (Decision Trees, Majority Learning, Bayes Learning, Nearest Neighbor) with skinsegdb.tab to provide a probability for whether or not the user-submitted image shows a cancerous mole.

Proposed Web-based Melanoma Diagnostic System

[Flowchart] Web page: submit .jpg image of mole → feature extraction → features → machine learning → diagnosis → display result.

Results

[Figure: results charts; axis label: Machine Learning Algorithm]

Analysis & Conclusion

● I concluded that, using image processing and machine learning tools, I can create an effective algorithm to assist with the early diagnosis of melanoma cancer. This algorithm can be implemented in a website which can be used by people at home.

● The best method on my data set was Bayes learning combined with the Hosei tool for feature extraction. This was different from my original hypothesis.

● The accuracy of classification was lower than I predicted, but I only had about 130 images in my database. I did some experimentation and found that accuracy increases with the size of the database, so I expect more images to improve the accuracy.

● I eliminated some variables from the machine learning database to help keep my results consistent. I eliminated number of pixels, perimeter, and area, because each image used a different scale. I also eliminated the file name because it has no bearing on whether or not the mole in the image is cancerous.

● Next steps include automating the process, using a larger database of images, using a parallel computing architecture, such as CUDA, for faster computation, and creating an iPhone application.
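Removing the scale-dependent variables and the file name from the database, as described in the list above, can be done with a short script. This is only a sketch: the function name `drop_columns` and the column names `filename`, `pixels`, `perimeter`, and `area` are hypothetical, and the TAB file is assumed to have a single header row.

```python
import csv
import io


def drop_columns(tab_text, unwanted):
    """Remove the named columns from TAB-delimited text with one header row."""
    reader = csv.reader(io.StringIO(tab_text), delimiter="\t")
    header = next(reader)
    keep = [i for i, name in enumerate(header) if name not in unwanted]
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([header[i] for i in keep])
    for row in reader:
        writer.writerow([row[i] for i in keep])
    return out.getvalue()


# Hypothetical database with one scale-dependent column and the file name.
tab = "filename\tarea\tasymmetry\tmelanoma\nm1.jpg\t5120\t0.12\tyes\n"
print(drop_columns(tab, {"filename", "pixels", "perimeter", "area"}))
```

Names in `unwanted` that do not appear in the header are simply ignored, so the same set can be reused across databases.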

Acknowledgements

● Dr. A. Goshtasby, Wright State University, for the Skinseg image processing tool

● Dr. Scott E. Umbaugh, Southern Illinois University, Edwardsville, for CVIP tools

● Dr. Alan Mishchenko, University of California Berkeley, for MVSIS

● Mr. Holger Lüdtke, founder of MoleExpert, for providing access to an evaluation version of the MoleExpert Micro tool

● Ms. Iris Cheng, University of California Berkeley, for sharing her research on image processing

● Dr. B.J. Shrestha, Missouri University of Science and Technology

● Dr. Kristin Stevens, MD, Dermatologist, Providence Medical Group, Portland. Field expert for current methods of diagnosis; also provided images of melanoma

● Dr. Sandhya Koppula, MD, Dermatologist, Cornell Dermatology Clinic, Beaverton, for providing medical field expertise
