What Data needs to be Collected for a PhD in Machine Learning ? - Phdassistance

What data needs to be collected for a PhD in Machine Learning? An Academic presentation by Dr. Nancy Agnes, Head, Technical Operations, Phdassistance Group www.phdassistance.com Email: [email protected]

description

A PhD in machine learning involves exploring and developing a precise subject matter among many machine learning subfields. In the AI industry, a PhD is appreciated as an outstanding achievement. Development in automated data analysis techniques and decision-making needs research work in machine learning algorithms and foundations, statistics, complexity theory, optimization, data mining, etc. This blog discusses the various data collection methods in the machine learning research field. Ph.D. Assistance serves as an external mentor to brainstorm your idea and translate that into a research model. Hiring a mentor or tutor is common and therefore let your research committee know about the same. We do not offer any writing services without the involvement of the researcher. Learn More: https://bit.ly/3uXw8b0 Contact Us: Website: https://www.phdassistance.com/ UK NO: +44–1143520021 India No: +91–4448137070 WhatsApp No: +91 91769 66446 Email: [email protected]

Transcript of What Data needs to be Collected for a PhD in Machine Learning ? - Phdassistance

Page 1: What Data needs to be Collected for a PhD in Machine Learning ? - Phdassistance

What data needs to be collected for a PhD in Machine Learning?

An Academic presentation by Dr. Nancy Agnes, Head, Technical Operations, Phdassistance Group www.phdassistance.com Email: [email protected]

Page 2

TODAY'S DISCUSSION

Outline

In Brief
Introduction
Data Finding
Types of data collection
Tools for data collection
Conclusion

Page 3

In Brief

A PhD in machine learning involves exploring and developing a precise subject matter among many machine learning subfields. In the AI industry, a PhD is appreciated as an outstanding achievement. Development in automated data analysis techniques and decision-making needs research work in machine learning algorithms and foundations, statistics, complexity theory, optimization, data mining, etc. This blog discusses the various data collection methods in the machine learning research field.

Page 5

Data must be developed for artificial intelligence (AI) and machine learning solutions.

It must be collected and stored in a way that suits the problem being solved.

Machine learning is heavily used for business intelligence and analytics, effective web search, robotics, smart cities, and understanding the human genome.

However, making use of the vast quantities of stored data remains a significant challenge for society, and as a result science and technology require large investments in computerization and data collection.

Page 7

Data Finding

Data finding can be viewed as two steps:

First, the created data must be indexed and published for sharing.

Second, others can then search those datasets for their own machine learning tasks.
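The two steps of data finding can be sketched with a toy in-memory index. This is only an illustration: the `DatasetIndex` class, its method names, and the sample dataset descriptions are all hypothetical and are not taken from any real dataset search system.

```python
from collections import defaultdict


class DatasetIndex:
    """Toy in-memory index over dataset descriptions (hypothetical, for illustration)."""

    def __init__(self):
        self._datasets = {}                 # dataset name -> description
        self._keywords = defaultdict(set)   # keyword -> names of datasets using it

    def publish(self, name, description):
        """Step 1: index a dataset so that others can find it."""
        self._datasets[name] = description
        for word in description.lower().split():
            self._keywords[word.strip(".,;:()")].add(name)

    def search(self, query):
        """Step 2: return the names of datasets that match every query word."""
        words = [w.strip(".,;:()") for w in query.lower().split()]
        if not words:
            return []
        matches = set.intersection(*(self._keywords.get(w, set()) for w in words))
        return sorted(matches)


# Made-up dataset descriptions, used only to exercise the index.
index = DatasetIndex()
index.publish("street-signs", "Labelled images of street signs.")
index.publish("clinic-notes", "Anonymised text notes from clinic visits.")
index.publish("sign-language", "Video clips and images of sign language.")

print(index.search("images"))
print(index.search("images street"))
```

A production dataset search service would add richer metadata (licence, size, schema) and ranking, but the publish-then-search split stays the same.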

Page 8

RESEARCH NEEDS

A PhD in machine learning involves exploring and developing a precise subject matter among many machine learning subfields.

In the AI industry, a PhD is appreciated as an outstanding achievement.

Developing automated techniques for data analysis and decision-making requires research work in machine learning algorithms and foundations, statistics, complexity theory, optimization, data mining, etc.

Page 9

Types of data collection

Data can be divided into two kinds:

STRUCTURED DATA

Well-defined types of data, such as dates, numbers, and strings, stored in search-friendly databases.

UNSTRUCTURED DATA

Everything else that can be collected but is not search-friendly, such as emails, text files, and media files (music, videos, photos).
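The practical difference between the two kinds of data can be seen in a small sketch. The table name, column names, values, and email text below are made up for illustration; the point is only that structured records can be queried directly, while the same facts in free text must first be extracted.

```python
import re
import sqlite3

# Structured data: typed columns in a database can be queried directly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE experiments (run_date TEXT, accuracy REAL, model TEXT)")
conn.executemany(
    "INSERT INTO experiments VALUES (?, ?, ?)",
    [
        ("2021-03-01", 0.81, "baseline"),
        ("2021-03-15", 0.86, "cnn"),
        ("2021-04-02", 0.90, "cnn"),
    ],
)
march_runs = conn.execute(
    "SELECT model, accuracy FROM experiments "
    "WHERE run_date BETWEEN '2021-03-01' AND '2021-03-31' ORDER BY run_date"
).fetchall()
conn.close()

# Unstructured data: the same facts buried in free text need extraction first.
email_body = (
    "Hi, the cnn run from 15 March reached 0.86 accuracy. "
    "The April rerun got to 0.90."
)
accuracies = [float(x) for x in re.findall(r"\b0\.\d+", email_body)]

print(march_runs)
print(accuracies)
```

The regular expression recovers the numbers but not which run or date they belong to, which is why unstructured sources usually need more elaborate processing before they can be used for training.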

Page 10

Data Acquisition

The aim of data acquisition is to find datasets that can be used to train machine learning models.

There are broadly three approaches in the literature:

Data discovery is needed when one wants to share or search for new datasets, and it has become more important as more datasets become available on the Web and in corporate data lakes.

Data augmentation complements data discovery: existing datasets are enhanced by adding more external data.

Page 11

Data generation can be used when no external dataset is available but crowdsourced or synthetic datasets can be generated instead.

The different methods are classified in Table 1.

Page 15

Data Scraping Tools

Data scraping describes the automated, programmatic use of an application to extract data or to perform tasks that users would otherwise perform manually, such as collecting social media posts or images.

Tools to extract data from the web are:

Page 16

Octoparse: A no-code web scraping tool used to collect public data.

Mozenda: A tool that does not require any scripts or developers to extract unstructured web data.
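For readers who prefer to script the extraction themselves, the idea behind scraping can be sketched with Python's standard `html.parser` module. This is not how Octoparse or Mozenda work internally; the `ImageLinkParser` class and the sample page are hypothetical, and a real scraper would download the page over HTTP rather than use a hard-coded string.

```python
from html.parser import HTMLParser


class ImageLinkParser(HTMLParser):
    """Collects the src attribute of every <img> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.image_urls.append(src)


# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<html><body>
  <h1>Gallery</h1>
  <img src="https://example.com/cat.jpg" alt="cat">
  <p>No picture here.</p>
  <img src="https://example.com/dog.jpg" alt="dog">
</body></html>
"""

parser = ImageLinkParser()
parser.feed(SAMPLE_HTML)
print(parser.image_urls)
```

The collected URLs could then be downloaded and labelled to form an image dataset.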

Synthetic Data Generator

Synthetic data can also be generated by programs to obtain large sample sizes.

Such data can be used to train neural networks.

Page 17

A few tools for generating synthetic datasets are:

Pydbgen: A Python library used to generate a large synthetic database as specified by the user.

Mockaroo: A data generator tool that lets users create custom CSV, SQL, JSON, and Excel datasets to test and trial software.
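What such generators do can be approximated in a few lines of standard-library Python. The sketch below does not use the Pydbgen or Mockaroo APIs; the field names, the city list, and the rule linking `age` to `purchased` are assumptions chosen purely for illustration.

```python
import csv
import io
import random


def generate_rows(n_rows, seed=0):
    """Return n_rows of fake (id, age, city, purchased) records."""
    rng = random.Random(seed)          # seeded so that runs are reproducible
    cities = ["Chennai", "London", "Leeds", "Mumbai"]
    rows = []
    for i in range(n_rows):
        age = rng.randint(18, 80)
        # The label loosely depends on age plus noise, so a model has something to learn.
        purchased = int(age + rng.gauss(0, 10) > 45)
        rows.append({"id": i, "age": age, "city": rng.choice(cities), "purchased": purchased})
    return rows


def to_csv(rows):
    """Serialize the records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "age", "city", "purchased"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = generate_rows(1000)
print(to_csv(rows[:3]))
```

Changing `n_rows` scales the dataset to any size, which is the main attraction of synthetic data when real samples are scarce.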

Page 18

Data Augmentation Tools

Data augmentation, in some cases, is used to increase the size of an existing dataset instead of gathering additional data.

For example, an image dataset is augmented by cropping, rotating, or changing the lighting of the original images.

OpenCV: A computer vision library with Python bindings that provides functions usable for image augmentation.

Examples include bounding boxes, cropping, scaling, rotation, blur, filters, translation, and so on.
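The image operations mentioned here can be illustrated without any image library by treating a grayscale image as a nested list of pixel values. This is a minimal sketch, not the OpenCV API: the function names and the 3x3 sample image are made up, and real projects would apply the equivalent library calls to full-size image arrays.

```python
def crop(image, top, left, height, width):
    """Return the height x width window whose top-left corner is (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror the image left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Add delta to every pixel, clipping to the valid 0..255 range."""
    return [[max(0, min(255, pixel + delta)) for pixel in row] for row in image]


# A tiny 3x3 grayscale "image" with made-up pixel values.
original = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# One original image yields four additional training samples.
augmented = [
    crop(original, 0, 0, 2, 2),
    rotate_90_clockwise(original),
    flip_horizontal(original),
    adjust_brightness(original, 200),
]
for variant in augmented:
    print(variant)
```

Each variant keeps the label of the original image, so the dataset grows without any new data collection.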

Page 19

scikit-image: A collection of image processing algorithms that is available free of charge and free of restriction.

It also provides colour space conversion, erosion and dilation, resizing, rotation, filters, and so on.
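Colour space conversion, one of the operations listed above, can be shown for a single pixel with Python's standard `colorsys` module. This is a stand-in for illustration, not the scikit-image API, and the helper name `rgb_to_hsv_pixel` is hypothetical; an image library would apply the same conversion to every pixel of an image at once.

```python
import colorsys


def rgb_to_hsv_pixel(r, g, b):
    """Convert 8-bit RGB values to (hue in degrees, saturation, value)."""
    # colorsys expects and returns floats in the range 0..1.
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 360), round(s, 2), round(v, 2)


print(rgb_to_hsv_pixel(255, 0, 0))    # pure red
print(rgb_to_hsv_pixel(0, 0, 255))    # pure blue
```

Working in HSV makes lighting changes easy to simulate, since brightness is isolated in the value channel.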

Page 20

Conclusion and Future Work

As machine learning becomes more widely used, it becomes more important to acquire and label large amounts of data, especially for state-of-the-art neural networks.

Given the current state of machine learning, its future holds significant opportunities for technologists.

Some of the use cases evolving today that broaden its future scope are:

Optimizing Operations
Safer Healthcare
Fraud Prevention
Mass Personalization

Page 21

Contact Us

UNITED KINGDOM: +44-1143520021

INDIA: +91-4448137070

[email protected]