We would provide the following:
An FTP folder containing:
About 165 csv files with approximately 260,000 lines and 9 columns each. Column A holds unique identifiers, and the following columns hold information about them. Each csv file’s name is a timestamp showing the exact date of the information in the file.
One additional xlsb file with unique identifiers in Column A, Date in Column B and additional information for the unique identifier in the following columns.
We need the following to be performed:
All csv files to be combined into one database, keeping the date of each unique identifier.
All unique identifiers with their dates in the xlsb file to be matched against the same unique identifiers from the same date in the csv files.
Once matched, the data from columns B to I of the csv files to be pasted into the xlsb file, on the unique identifier’s line, in columns E to L.
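A minimal sketch of the combining step, using only the Python standard library. Several details are assumptions and not part of the requirements: the file names are taken to look like `2016-08-01.csv` (the real timestamp format is not stated), the files are assumed to be UTF-8 with no header row, and `combine_csvs` is a hypothetical helper name. Rows are streamed one at a time, since 165 files of about 260,000 lines come to roughly 43 million rows.

```python
import csv
from datetime import datetime
from pathlib import Path

# Assumption: each file is named after its date, e.g. 2016-08-01.csv.
DATE_FORMAT = "%Y-%m-%d"


def combine_csvs(folder, out_path):
    """Copy every CSV in `folder` into `out_path`, prefixing each row with the file's date.

    Returns the number of rows written.
    """
    out_path = Path(out_path)
    rows_written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for csv_path in sorted(Path(folder).glob("*.csv")):
            # Skip the output file in case it lives in the same folder.
            if csv_path.resolve() == out_path.resolve():
                continue
            file_date = datetime.strptime(csv_path.stem, DATE_FORMAT).date()
            with open(csv_path, newline="", encoding="utf-8") as in_file:
                for row in csv.reader(in_file):
                    # Assumption: no header row, so every line is data.
                    writer.writerow([file_date.isoformat(), *row])
                    rows_written += 1
    return rows_written
```

The combined file then has the date in its first column, the unique identifier in the second, and the original columns B to I after that. The same rows could be loaded into a database table instead of a flat file.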
Example:
The xlsb file contains in Column A, unique identifier 123456, in column B, on the same line is date 01/08/2016
We would need to find the unique identifier 123456 in the csv file from 01/08/2016 and then copy the information from column B to I on the same line as the unique identifier, and then paste it in the xlsb file, on the same line as the unique identifier in the same order, starting from column E.
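The example above could be sketched as follows, again as an illustration under stated assumptions. The combined rows are assumed to be laid out as [ISO date, identifier, columns B to I]; the xlsb rows are assumed to have been read into plain lists already (reading an xlsb file needs a third-party reader or an export from Excel, which is not shown); and 01/08/2016 is assumed to mean 1 August 2016. `build_index` and `fill_xlsb_rows` are hypothetical names.

```python
from datetime import datetime

# Assumption: xlsb dates are day/month/year, so 01/08/2016 is 1 August 2016.
XLSB_DATE_FORMAT = "%d/%m/%Y"


def build_index(combined_rows):
    """Map (identifier, ISO date) to the eight values from csv columns B..I.

    Each combined row is assumed to be [ISO date, identifier, B, C, ..., I].
    """
    index = {}
    for row in combined_rows:
        values = (list(row[2:10]) + [""] * 8)[:8]  # pad short rows to 8 values
        index[(row[1], row[0])] = values
    return index


def fill_xlsb_rows(xlsb_rows, index):
    """Return copies of the xlsb rows with columns E..L filled from the index.

    Each xlsb row is assumed to be [identifier, date, ...]. Rows without a
    match keep blank cells in E..L.
    """
    filled = []
    for row in xlsb_rows:
        iso_date = datetime.strptime(row[1], XLSB_DATE_FORMAT).date().isoformat()
        new_row = list(row) + [""] * max(0, 12 - len(row))  # ensure columns A..L exist
        match = index.get((row[0], iso_date))
        if match is not None:
            new_row[4:12] = match  # columns E..L are list positions 4..11
        filled.append(new_row)
    return filled


combined = [["2016-08-01", "123456", "b", "c", "d", "e", "f", "g", "h", "i"]]
xlsb = [["123456", "01/08/2016", "info1", "info2"]]
result = fill_xlsb_rows(xlsb, build_index(combined))
```

Using a dictionary keyed by identifier and date keeps each lookup constant-time, which matters when about 40,000 xlsb lines are matched against tens of millions of csv rows.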
Then, we would need the xlsb file with all of the above information and one file with the combined csv files.
The database with all of the combined files would not necessarily contain all of the fields in each of the csv files but to the very first csv file should be added all the changes in the information concerning each unique identifiers. If no change has occurred, the unique identifier's information doesn't need to be duplicated.
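One possible reading of this requirement is that a row is kept only when an identifier's information differs from the last row kept for it. The sketch below follows that interpretation, which is an assumption and not a confirmed specification. Rows are assumed to be [date, identifier, columns B to I] and sorted by date; `keep_changes_only` is a hypothetical name.

```python
def keep_changes_only(combined_rows):
    """Keep a row only if its values differ from the identifier's last kept row.

    Rows are assumed to be [date, identifier, B, ..., I], sorted by date ascending.
    """
    last_values = {}  # identifier -> values of the most recent kept row
    kept = []
    for row in combined_rows:
        identifier, values = row[1], tuple(row[2:])
        if last_values.get(identifier) != values:
            kept.append(row)
            last_values[identifier] = values
    return kept


rows = [
    ["2016-08-01", "123456", "a"],
    ["2016-08-02", "123456", "a"],  # unchanged, dropped
    ["2016-08-03", "123456", "b"],  # changed, kept
]
changes = keep_changes_only(rows)
```

If the xlsb matching is run against this reduced table, an exact-date lookup will miss dates on which nothing changed, so the lookup would need to take the latest row on or before the xlsb date instead.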
To give a better understanding of the scope of the job: the xlsb file contains about 40,000 lines.
We have about 165 csv files. Each of the csv files contains about 240,000 lines.
At the end, we will need the files and a guide on how to perform this operation ourselves in the future.
Hi,
I have a good understanding of MySQL, in addition to a number of programming languages and testing experience. If given the opportunity to work on this project, I will do a good job and deliver it on time.
And thanks for the detailed explanation of the requirements.
Two clarifications on your statement below:
"The database with all of the combined files would not necessarily contain all of the fields in each of the csv files but to the very first csv file should be added all the changes in the information concerning each unique identifiers. If no change has occurred, the unique identifier's information doesn't need to be duplicated."
1. Did you mean that the combined database would not contain "all the rows", rather than "all the fields", that is, one row per unique identifier across the 165 csv files? So if all the csv files had the same 260,000 unique identifiers, our DB would also have 260,000 records?
2. The xlsb appears to be a subset of the unique identifiers from the csv files.
I hope we will be provided either the real or close to real files for coding and testing to confirm such assumptions.
I am bidding seven days to be safe. If all clarifications are received quickly, I can finish sooner.
Regards,
Ramya
$30 USD in 7 days
0.0 (1 review)
14 freelancers are bidding on average $135 USD for this job
Hi! For this price, I will try not only to solve what you need but also to build automated, configurable software that can perform the task at any time in the future, so you will be able to repeat it whenever needed. It would be done in 2-3 days at most.
Greetings! We are an IT services company. Among the tools we use are Talend ETL and data mining. We can process the data you provide and also supply routines, sources and advice so that you can use and improve the process.
Contact Us