Write some Software

В роботі Опубліковано %project.relative_time Оплачується при отриманні
В роботі Оплачується при отриманні

Hi!

I am looking to put something together where I can load in several files worth of data (in the attached xslx format) into one large database and to do the following:

1 - As each xslx file is loaded in:

- automatically add the name of the file as a field called 'original filename' in every record (starting with several files to be added as 'Blacklist'

- add an @ to the beginning of each 'screen name'

- find and delete (in bulk) records where 'screen name' is a repetition of one that already exists on the database, so there is only one of each remaining

2 - Search the resulting records and assign numbers to a 'priority' field (for example, search for all records where 'original filename' is Cribs and 'location' contains london and add a priority of 1 to all of those records.

3 - Sort the resulting records by 'priority', then export the 'screen name' column only to several CSV files, with a maximum number of records in each file (e.g. if the results had been sorted and filtered to show 14,000 records, and output length was set to 5,000 max, the result would be 3 sequentially numbered CSV files, two of 5,000 and one of 4,000. The exported records would then be tagged with filename: 'Blacklist'

Happy if the front end was in tabular/excel style as I am very familiar with excel. I originally wanted to do this task within excel, but have many hundreds of thousands of records to load in from 30 or so files, so it was not really feasible.

Cheers!

Charlie

Just as thought. Instead of the 'assigning priority and then sorting' stage, it may be simpler to be able to search within the location field for things containing a particular location, then export sequential files in 5k chunks that are labeled with both the original filename and location.

If it saved on processing power you could also remove all extra data from the files as they are loaded in and only leave the screen name (with @ added to the front), the location and the original filename.

Thanks

UPDATED BRIEF. For clarity. Ideally:

LOAD IN one or more files in the xlsx format (all at the same time).

AUTOMATICALLY as each file is loaded in:
- Import the 'screen name' column and add an @ to the beginning of it
- Import the 'location' column
- Add a third column with the 'filename' that each record was loaded in from (at the beginning, several files named BLACKLIST will be loaded in)
- Check 'screen name' against any records that already exist in the database and delete duplicates

Then allow an EXPORT SEARCH where the user can specify 2 things:
- 'location' - allowing a search where the location field CONTAINS the specified word (this variable could also be set to 'any location')
- 'filename' - default is set to filename DOES NOT CONTAIN blacklist, but could be changed by the user to specify that it EQUALS a specific filename.
- file length - where the default would be set to 5,000 to specify how many records in each file

When a user has specified the above criteria they can press the 'Export' button with the following result:
- Several csv files of the specified size would be created.
- They would contain only the 'screen name' column
- They would be named sequentially according to the original filename and the specified search terms. Eg if someone had loaded in one file named 'Cribs and one named 'Bastille', then specified 'london' and 5000 in the search terms, finding 11,000 bastille records and 7,000 for cribs, the resulting files would be:

Bastille-london-5000-1
Bastille-london-5000-2
Bastille-london-5000-3
Cribs-london-5000-1
Cribs-london-5000-2

Lastly, every record that is successfully exported in this way would automatically have its 'filename' changed to BLACKLIST.

Thanks

Charlie

Розробка баз даних Програмування баз даних Microsoft MySQL

ID Проекту: #6093325

Про проект

4 заявок(-ки) Дистанційний проект Остання активність Jun 21, 2014

Доручено:

acronymsolutions

Hi, I have gone through your requirements and attachment as well.I am a DBA at CMMI-5 Level company since August 2010.I am working on MS Access,MySql,SQL Server,Oracle database, I can do your job [login to view URL] your r Більше

£200 GBP за 8 дні(-в)
(25 відгуків(-и))
5.2

4 фрілансерів(-и) готові виконати цю роботу у середньому за £183

aneeshmonn

I have 5+ years of work experience in handling large data set processing using sql, PL/SQL, PostgreSQL, and then with ETL tools like Pentaho. I believe I am the right candidate for your project. I have couple of Більше

£222 GBP за 7 дні(-в)
(0 відгуків(и))
0.0
salmanhameed

Hello, I have already done this task for an export and import company where CBR issue alot of excel files and we have to manipulate data base on HS Code, so i think i can perform this task in no time. looking forward Більше

£150 GBP за 7 дні(-в)
(0 відгуків(и))
0.0