Pre-requirements for machine learning projects

Kornel Dylski - December 21, 2020

Machine learning projects are currently among the most exciting ways of using computing power to perform specific, clearly defined tasks. If your goal is to create a narrow, one-purpose program that consistently and efficiently yields great results, machine learning may be the right choice.

But before you start, make sure your goal is defined very precisely in the first place. Here are a couple of considerations and important pre-requirements for starting your first machine learning project.

Do you really need to create a new project?

Machine learning is gaining popularity in specific applications, but it’s important to remember that not every project needs to leverage machine learning.

Machine learning offers great results in image analysis applications (object detection, recognition, classification) and handling tabular data in general (e.g. for optimization, predictions, identification of trends). It can be successfully applied in text analysis and inference from data too, but the results, albeit helpful, are less impressive than in image recognition and analysis.

Image analysis alone is a really broad category. It also covers projects which don’t directly involve image processing at all. For example, if you can transform your data (any data, even data unrelated to images) and represent it as a 2D colored picture, the task still falls under the “image analysis” category.

For example, audio can be transformed (using the Fourier transform) into a spectrogram, a visual representation of the sound spectrum over time, which is nothing more than a 2D image. Such an image, just like any other image, can then be compared, classified and analysed by machine learning models against many other similar images.
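As a rough illustration of the audio-to-image idea, here is a minimal sketch that turns a signal into a spectrogram using only the Python standard library. The function names, frame size, hop size and the 1 kHz test tone are all assumptions made for this example; in practice a library FFT routine would normally replace the naive transform used here.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the non-negative frequency bins of a real-valued frame (naive DFT)."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(total))
    return magnitudes


def spectrogram(samples, frame_size=64, hop=32):
    """Cut the signal into overlapping frames; each row is one moment in time,
    each column one frequency bin. The result can be treated as a 2D image."""
    # A Hann window softens the frame edges and reduces spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_size) for i in range(frame_size)]
    rows = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_size)]
        rows.append(dft_magnitudes(frame))
    return rows


# Hypothetical input: a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
image = spectrogram(tone)

# Index of the brightest "pixel" in the first row of the image.
peak_bin = max(range(len(image[0])), key=lambda k: image[0][k])
```

With these settings each frequency bin is 125 Hz wide (8000 / 64), so the tone's energy lands in bin 8. Every row of `image` can then be fed to an image model like a row of pixels.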

Does “image analysis” fit your purpose?

When using the phrase “image analysis” I typically mean the model’s ability to answer various questions about an image, ranging from the basic “what is in the picture?” to more sophisticated questions like:

  • Where is the person in the picture?
  • Is it winter in the picture?
  • In what style is this painting?
  • How fast is the object in the picture moving?

The list of questions can go on and on. But if your goal can roughly fit into any of these categories or questions, it would probably be more effective to adapt an existing model rather than trying to create something entirely from scratch.

Picture labelled as “Dog is running”

What are the pre-requirements for a machine learning project?

To decide whether what you’re trying to achieve fits an existing machine learning model, you will have to answer a number of questions first. This will help you find an existing model that you could adapt to your needs and save a lot of time in the process.

Take, for example, a project which enables object detection in real-time using a webcam.

  • What should your project return as an answer?
  • What is the list of objects/categories (e.g. all possible hand gestures) that will be detected?
  • Is the list of gestures finite or not?
  • Does object detection consider the “unknown” category?
  • Do you want to be able to recognize multiple objects simultaneously?

Based on these answers, a machine learning developer (or data scientist) can narrow down the available loss functions and models to a few that could be used in the project.
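To make the link between these answers and the choice of loss function more concrete, here is a minimal sketch in plain Python. It contrasts a single-label setup (exactly one gesture per image, softmax with cross-entropy) with a multi-label setup (several gestures at once, one sigmoid per class with binary cross-entropy). The label names, the confidence threshold, and the idea of handling “unknown” by thresholding are assumptions chosen for illustration, not the only way to do it.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    return 1.0 / (1.0 + math.exp(-score))


def cross_entropy(scores, target_index):
    """Single-label loss: exactly one category is correct."""
    return -math.log(softmax(scores)[target_index])


def binary_cross_entropy(scores, targets):
    """Multi-label loss: every category is an independent yes/no question."""
    eps = 1e-12  # avoids log(0)
    losses = []
    for score, target in zip(scores, targets):
        p = sigmoid(score)
        losses.append(-(target * math.log(p + eps) + (1 - target) * math.log(1 - p + eps)))
    return sum(losses) / len(losses)


def predict_single(scores, labels, min_confidence=0.6):
    """Return the most likely label, or "unknown" when the model is not confident enough."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best] if probs[best] >= min_confidence else "unknown"


# Hypothetical finite list of gestures.
labels = ["thumb_up", "fist", "open_palm"]
confident = predict_single([4.0, 0.5, 0.2], labels)
uncertain = predict_single([1.0, 0.9, 1.1], labels)
```

Here `confident` resolves to a gesture because one score clearly dominates, while `uncertain` falls back to "unknown" because no class reaches the threshold.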

How adaptive should your model be?

In an ideal world, gesture recognition would be based on a high-quality still photo showing only a hand, with a single gesture shown against a clear background in well-lit conditions. But we all know real-life conditions are never like that.

When designing the model, you need to consider many factors like camera image quality, skin and clothing colors, background, distance from the camera – as well as different times of day and lighting conditions. A well-trained model should take into account all these aspects and consistently offer great accuracy.

But if you’re not planning to use the app outside of your office, it’s a whole different story, and you could save a lot of work by not trying to factor in all the variables. If all you want is for the model to recognize a simple “thumb up” sign shown in front of the camera in your office, preparing for all other environments would be redundant.
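If the model does have to cope with varied conditions, one common way to prepare for them is to augment the training images. The sketch below is a simplified illustration using plain Python lists as grayscale images; the function names and the brightness range are assumptions, and real projects would usually rely on an image library instead.

```python
import random


def adjust_brightness(image, factor):
    """Scale every pixel and clamp it to the valid 0..255 range (simulates lighting changes)."""
    return [[min(255, max(0, round(pixel * factor))) for pixel in row] for row in image]


def flip_horizontal(image):
    """Mirror the image (for example, left hand vs. right hand)."""
    return [row[::-1] for row in image]


def augment(image, rng):
    """Produce one randomly altered copy of a training image."""
    result = adjust_brightness(image, rng.uniform(0.6, 1.4))  # assumed lighting range
    if rng.random() < 0.5:
        result = flip_horizontal(result)
    return result


# Hypothetical 2x3 grayscale image.
original = [[10, 120, 250], [0, 60, 200]]
rng = random.Random(42)  # fixed seed so the run is repeatable
variants = [augment(original, rng) for _ in range(5)]
```

Each entry in `variants` keeps the original shape and label but differs in brightness and orientation. For an office-only application, this step could be reduced or skipped.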

What is the goal of the model?

Try to find a way to determine whether your model is effective and how accurate it is. This will involve defining a threshold for “passable” recognition, which will also help you determine whether the model is improving.

It’s important to create certain design metrics that will tell you that you are progressing. This can provide valuable feedback for improving your model and refining it so it better serves the purpose.
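One simple way to put this into practice is to compute accuracy on a fixed set of labelled examples after every training run and compare it with an agreed threshold. The sketch below assumes a threshold of 0.9 and uses made-up gesture labels; both are placeholders, and accuracy is only one of many possible metrics.

```python
def accuracy(predictions, expected):
    """Fraction of predictions that match the expected labels."""
    if len(predictions) != len(expected):
        raise ValueError("predictions and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for p, e in zip(predictions, expected) if p == e)
    return correct / len(expected)


PASSABLE_ACCURACY = 0.9  # hypothetical threshold; choose one that fits your project


def review_run(history, predictions, expected, threshold=PASSABLE_ACCURACY):
    """Record this run's accuracy and report whether it is passable and better than the last run."""
    score = accuracy(predictions, expected)
    improved = not history or score > history[-1]
    history.append(score)
    return {"accuracy": score, "passable": score >= threshold, "improved": improved}


# Hypothetical test labels and the predictions from two training runs.
expected = ["thumb_up", "fist", "thumb_up", "open_palm", "fist"]
history = []
first = review_run(history, ["thumb_up", "fist", "fist", "fist", "fist"], expected)
second = review_run(history, ["thumb_up", "fist", "thumb_up", "open_palm", "fist"], expected)
```

Keeping the `history` list around makes it easy to see whether changes to the model or the data actually move the metric in the right direction.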

Collect and prepare data

For machine learning to work, you will need data to train the model. The data set has to be adjusted to your needs, and a carefully selected data set can be very helpful for the project.

Fortunately, there are already plenty of free-to-use datasets. Amazon, Google, Kaggle, public universities, state offices and many other organizations publish ready-to-use datasets. If you are lucky, you can just fetch images from search engines. Nevertheless, the dataset always has to be prepared for training. And do not forget to prepare a test set with examples of real usage.
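As a minimal sketch of the preparation step, the code below sets aside part of a labelled dataset as a test set while keeping every label represented in both parts. The function name, the 20% test fraction and the file names are assumptions for illustration. A split like this only approximates real usage, so ideally the test set would also include examples captured in the conditions where the model will actually run.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (item, label) pairs into train and test lists, label by label."""
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append((item, label))

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        # Keep at least one example per label in the test set.
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical dataset: 10 images per gesture.
samples = [(f"thumb_up_{i}.jpg", "thumb_up") for i in range(10)]
samples += [(f"fist_{i}.jpg", "fist") for i in range(10)]
train_set, test_set = stratified_split(samples)
```

The test examples are never shown to the model during training, so the accuracy measured on them is a fairer estimate of how it behaves on new data.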


If you’ve got any questions about machine learning or are considering starting an ambitious ML project that’s uncomfortably beyond your expertise, don’t hesitate to drop us a line. Our machine learning experts are ready to help (and really like challenges).

About the author

Kornel Dylski

Software Engineer

Kornel is a frontend engineer with several years of experience building robust web applications. Apart from web solutions, he participates in machine learning projects. He has always been interested in physics, which led him to explore artificial intelligence and programming languages such as Python.
His focus is on solving technical problems and providing data-driven solutions to clients' needs. He has a creative spirit and loves to make people laugh or smile while working together on complex issues.
