Automated analysis of feedback in various formats of data : search for methods and tools to extract insights and information
Huotari, Harri (2017)
Huotari, Harri
Tampereen ammattikorkeakoulu
2017
All rights reserved
The permanent address of the publication is
https://urn.fi/URN:NBN:fi:amk-201702082165
Abstract
Jaxber is a cloud-based tool that collects feedback, shareable information and ideas in various formats through targeted surveys or campaigns. The feedback is collected with a mobile application. The content can be text, video, audio files and images in a multitude of languages. Jaxber is owned, developed and marketed by Nestronite Oy.
The purpose of this thesis was to find tools that can process all the content produced by Jaxber in as automated a manner as possible. The need to shorten the time spent going through the feedback grows as the amount of material becomes overwhelming. The processing was to produce summaries that highlight the essential information in the feedback for the customer of the campaign.
Potential methods and tools for the task were surveyed, mostly by browsing the offerings of different IT companies and open source providers. The most promising ones were tested with two sets of real data collected by Jaxber. Free trials of software products and development environments were available in most cases for assessing suitability.
No single tool was found that covers the whole task. The most promising one for natural language processing was the Bitext product. For extracting information from video files, the Google Speech API is recommended, partly due to its support for 80 languages. The process involves first converting the content to audio files and then transcribing the speech to text. Google’s Vision API can be used for analyzing image content.
This is only the start. The trials show that the technology exists. The next step is to develop the necessary application for automating the processing of input files and utilizing cloud-based services. Python is recommended as the programming language because it supports open source libraries for further natural language processing. For text analytics, Bitext can be taken into use at any time if the company finds the price acceptable.
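The processing flow described above (video converted to audio, audio transcribed to text, images analyzed separately) could be sketched in Python roughly as follows. This is only an illustrative sketch, not the application proposed in the thesis. The suffix table, the step names and the function names are hypothetical. Applying text analytics to transcripts and sending audio files straight to speech-to-text are also assumptions. The audio extraction step assumes the ffmpeg command-line tool, which the abstract does not name. The sketch only builds the command and makes no calls to any cloud service.

```python
from pathlib import Path

# Hypothetical mapping from file suffix to media kind (an assumption,
# not a list taken from Jaxber); extend it to match the real input files.
MEDIA_KINDS = {
    ".txt": "text",
    ".mp4": "video", ".mov": "video",
    ".wav": "audio", ".flac": "audio", ".mp3": "audio",
    ".jpg": "image", ".jpeg": "image", ".png": "image",
}

# Hypothetical step names per media kind. The video chain follows the
# abstract: extract audio first, then transcribe the speech to text.
STEPS = {
    "text": ["text_analytics"],
    "audio": ["speech_to_text", "text_analytics"],
    "video": ["extract_audio", "speech_to_text", "text_analytics"],
    "image": ["image_analysis"],
}


def plan_steps(filename):
    """Return the ordered list of processing steps for one input file."""
    suffix = Path(filename).suffix.lower()
    kind = MEDIA_KINDS.get(suffix)
    if kind is None:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return STEPS[kind]


def ffmpeg_extract_audio_args(video_path, audio_path):
    """Build (but do not run) an ffmpeg command that drops the video
    stream and writes mono 16 kHz audio, assuming ffmpeg is installed."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", str(video_path),   # input video
        "-vn",                   # discard the video stream
        "-ac", "1",              # one audio channel (mono)
        "-ar", "16000",          # 16 kHz sample rate
        str(audio_path),
    ]
```

A caller could pass the returned argument list to `subprocess.run(args, check=True)` and then hand the resulting audio file to whichever speech-to-text service is chosen. Keeping the planning step separate from the service calls makes it possible to test the routing without network access or credentials.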