Data and information can be ‘captured’ from any source, whether paper or digital, structured or unstructured, or through physical observation. The source typically consists of large numbers of individual documents in which specific sections need to be identified, captured and then presented for further analysis, added to an existing database system or compiled into a list. In the case of physical observation, it could mean instant analysis of ‘people movement’ activity in a specific area.
Datanology have been tasked with ‘capturing’ data for clients with varied requirements.
How could it work for you?
One of the most underestimated and under-utilised, yet valuable, assets of a company is the data it holds and uses in its everyday running of the business. The more accessible, clean and structured that data is, the higher its value to the company.
If you’ve got thousands of paper-based files stored in cupboards, drawers or in the corner, they’re taking up valuable space and not working for you. Our OCR & IDT scanning, indexing and data capture process could turn this information into valuable data and free up space very quickly.
Even digital files stored on your hard drives, memory sticks etc. may not be giving you the full value of the information you hold. Our bespoke systems can capture the data you need to use in your business, streamlining the information so it can be located and accessed instantly.
Complex spreadsheets can contain hundreds of thousands, even millions, of pieces of valuable data spread across different worksheets and different files. Our data capture in Excel is extremely quick, and we use a variety of processes to capture the data, blend it and automate it.
How do we do it?
Manual keying: This process usually suits low-volume, less regular data capture such as leaflet-based marketing or feedback responses, and hand-written contact or product sheets.
IDT (Intelligent Data Trawling): Used on large volumes of digital files such as Word, PDF or Rich Text documents, where a rules-based system is created to analyse the data and capture specific keywords, key phrases or structured entities such as phone numbers, email addresses or postcodes.
OCR (Optical Character Recognition): Used when the data to be captured is in printed paper format. It is usually followed by another process such as Intelligent Data Trawling, ‘grabbing’ or a mixture of both.
Grabbing: This process has proved highly effective where the data source lacks structure and is high volume. It is typically used alongside other processes, including OCR and IDT, to maximise the result and provide a quick and accurate output. ‘Grabbing’ is used to ensure a high degree of accuracy in filtering out the exact data required. An example is the extraction of full names, email addresses, phone numbers and postal addresses, where IDT alone may give less accurate results because of the complexity of identifying names, street names etc. in a large set of data.
Other technology: We also use a variety of software solutions to convert documents between formats such as PDF, Word, .txt, .csv and HTML where necessary, to ensure a speedy capture of data.
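To give a feel for what a rules-based capture of structured entities (as described under IDT above) can look like, here is a minimal Python sketch. It is an illustration only, not Datanology's actual system: the patterns, the `extract_entities` function name and the sample text are all assumptions, and the phone and postcode rules are simplified UK-style shapes rather than full validators.

```python
import re

# Illustrative rules only (assumed for this sketch); real rules would be
# tuned to each data source.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Simplified UK postcode shape, e.g. "SW1A 1AA"; not a full validator.
POSTCODE_RE = re.compile(r"\b[A-Z]{1,2}[0-9][A-Z0-9]?\s?[0-9][A-Z]{2}\b")
# Simplified UK phone shape: a leading 0 or +44, then 9-10 digits,
# with optional single spaces between digits.
PHONE_RE = re.compile(r"(?:\+44\s?|\b0)\d(?:\s?\d){8,9}\b")


def extract_entities(text):
    """Return the email addresses, phone numbers and postcodes found in text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
        "postcodes": POSTCODE_RE.findall(text),
    }


if __name__ == "__main__":
    # Hypothetical sample text standing in for one converted document.
    sample = (
        "Contact Jane Smith on 01234 567890 or jane.smith@example.com, "
        "10 High Street, SW1A 1AA"
    )
    print(extract_entities(sample))
```

Rules like these work well for entities with a predictable shape. Full names and street addresses have no such shape, which is why the text above notes that IDT alone may be less accurate for them.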
Our data capture experts are a team with varied experience, including Microsoft Excel developers who are proficient in data analysis and bulk data capture through VBA programming, and who know how the structure of the data can be managed to enable an efficient capture.
Several thousand CVs…
…in a mixture of Word, PDF and Rich Text formats were received by a company that needed them indexed for its ongoing recruitment process. Our data capture process meant that each person’s full name, email address, telephone number and postcode could be extracted quickly, allowing a fully indexed system to be created. Further data capture was then required to tag specific CVs that met a number of keyword criteria for the necessary skills, availability and experience of the job seeker.
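The keyword tagging step in this example could be sketched along the following lines. This is a hedged illustration in Python rather than the process actually used: the `tag_cv` function, the tag names and the keyword lists are hypothetical, and it relies on simple case-insensitive substring matching, which a real project would probably need to refine.

```python
def tag_cv(cv_text, criteria):
    """Return the tags whose keywords ALL appear in the CV text.

    criteria maps a tag name to a list of keywords or phrases.
    Matching is a plain case-insensitive substring check.
    """
    lowered = cv_text.lower()
    return [
        tag
        for tag, keywords in criteria.items()
        if all(keyword.lower() in lowered for keyword in keywords)
    ]


if __name__ == "__main__":
    # Hypothetical criteria for skills and availability.
    criteria = {
        "excel_developer": ["excel", "vba"],
        "immediate_start": ["available immediately"],
    }
    cv_text = "Experienced Excel developer with VBA skills. Available immediately."
    print(tag_cv(cv_text, criteria))
```

Keeping the criteria in a plain dictionary means new tags can be added without changing the matching code.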
Multiple product and pricing catalogues…
Location activity analysis (capturing physical location data)…
Leaflet marketing & feedback responses…
Paper-based contacts collection…