
Extract and concatenate to single string 2 or 3 or 4.. text elements in a pdf, by regex and proximity

£20-250 GBP

Completed
Posted about 4 years ago

Paid on delivery
I need .NET code to highlight (and extract and concatenate) 2, 3 or 4 separate text elements in a PDF, based on regular expressions and their proximity to each other (see attached image).

1.) I am using the DevExpress PdfDocumentProcessor to obtain the document text and coordinates using the [login to view URL] property.
2.) Then, use the standard Regex class to get all substrings in the given string (the text returned with the [login to view URL] property) that match your regular expression.

Example text in PDF:
240
TT
12345

Example regex (should find the elements above individually):
1st line: 3 numeric chars: ^\d{3}$
2nd line: 2 alpha chars: ^[A-Z]{2}$
3rd line: 5 numeric chars: ^\d{5}$

Criteria:
- All 3 text elements share the same text height
- All 3 text elements have the same (or close, +-10%) X coordinate value
- All 3 text elements are within a Y coordinate value of char height * 4 +-10% of each other

Example text in PDF: 240 TT 12345
Required concatenated string: 240TT12345

I'm guessing the workflow would be something along the lines of:
- Open the PDF
- Extract all text elements
- Find text matching the first line of regex
- Is there a text element of the same character height with the same X coordinate value below this element (within the height of the text element above +-10%)?
- Is there another text element of the same character height with the same X coordinate value below this element (within the height of the text element above +-10%)?
- If there is, extract all text elements, concatenated to a string, e.g. 240TT12345
- Highlight the elements in the PDF

I would class myself as an intermediate coder, but I'm really struggling here because the number of lines to search using regex can be 2, sometimes 3, maybe 4. Perhaps a LINQ query to find all by regex and proximity, however I am happy to see all suggestions.
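The workflow guessed at above can be sketched as a chain search: start from every element matching the first regex, then repeatedly look for an element below it that matches the next regex and passes the height, X and Y checks. The sketch is in Python rather than .NET and does not touch DevExpress or highlighting; it only shows the matching logic, which should port to a LINQ query or a nested loop. The names TextElement, close and find_groups are hypothetical. Two assumptions are made that the posting does not settle: Y is taken to grow downward on the page, and "char height * 4 +-10%" is read as an upper bound on the vertical gap between consecutive elements.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TextElement:
    """Hypothetical stand-in for one extracted word plus its position."""
    text: str
    x: float       # left edge
    y: float       # vertical position; ASSUMED to grow downward
    height: float  # character height


def close(a, b, tol=0.10):
    """True when a and b differ by at most tol (10%) of the larger magnitude."""
    return abs(a - b) <= tol * max(abs(a), abs(b))


def find_groups(elements, patterns, max_gap_factor=4.0, tol=0.10):
    """Return (concatenated_text, chain) for every chain of elements that
    matches the patterns in order, one pattern per stacked element.
    patterns must be non-empty; it may hold 2, 3, 4 or more regexes."""
    regexes = [re.compile(p) for p in patterns]
    found = []

    def extend(chain):
        if len(chain) == len(regexes):
            found.append(("".join(e.text for e in chain), chain))
            return
        prev, rx = chain[-1], regexes[len(chain)]
        for el in elements:
            gap = el.y - prev.y  # positive when el is below prev
            if (rx.search(el.text)
                    and close(el.height, prev.height, tol)
                    and close(el.x, prev.x, tol)
                    and 0 < gap <= prev.height * max_gap_factor * (1 + tol)):
                extend(chain + [el])

    for first in elements:
        if regexes[0].search(first.text):
            extend([first])
    return found


# Made-up coordinates, for illustration only.
elements = [
    TextElement("240", 100.0, 50.0, 10.0),
    TextElement("TT", 101.0, 65.0, 10.0),
    TextElement("12345", 99.0, 80.0, 10.0),
    TextElement("999", 300.0, 50.0, 10.0),  # matches line 1, nothing below it
    TextElement("AB", 100.0, 400.0, 10.0),  # matches line 2, but too far down
]
patterns = [r"^\d{3}$", r"^[A-Z]{2}$", r"^\d{5}$"]

print([text for text, _ in find_groups(elements, patterns)])  # ['240TT12345']
```

Because the number of patterns drives the recursion depth, the same function handles the 2-line and 4-line cases without changes. Each returned chain keeps the matched elements, so their coordinates would still be available for a later highlighting step.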
Project ID: 24739628

About the project

9 proposals
Remote project
Active 4 yrs ago

Awarded to:
I have FULL CONFIDENCE that I can lend you a hand in sorting out your Regular Expressions problem, and I am ready to start IMMEDIATELY.

QUESTIONS/COMMENTS
1) It would be very helpful if you could upload a small sample of the [login to view URL] property that you are going to parse. I am asking because I think it will contain the contents of multiple text elements, and I could then see clearly what you mean by proximity.
2) What exactly do you mean by a text element? As I see it in the attached image, there are 3 "figures" joined by dashed arrows. The 1st figure contains 240-TE-24381, the 2nd contains 240-TT-24381, and the 3rd contains 240-TI-24381. Does the "figure" (e.g. 240-TE-24381) correspond to a text element, or does each individual part within the figure, viz. 240, TE, 24381, constitute a text element?
3) I have not followed how the X or Y offsets are related to the regexes. Please explain.

EXPERIENCE
Although new to Freelancer.com, I have EXTENSIVE experience in Regular Expressions and I am quite familiar with the regex "flavour" implemented in .NET. Thus, I know that named capturing groups in .NET use the (?<id>\w+) or (?'id'\w+) format, while the syntax for named capturing groups in some other flavours, such as Python, is (?P<id>\w+). In addition to "regular" concepts such as character classes, anchors, word boundaries, etc., I am also very much at home with concepts such as atomic grouping, lookahead and lookbehind.

Thanks,
Tushar
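As a brief illustration of the named-group syntax mentioned in this bid, here is a minimal Python sketch that splits a tag like the 240-TE-24381 from question 2 into its parts. Python's re module uses the (?P<name>...) form; the .NET equivalent would be written (?<name>...). The group names area, kind and number and the helper split_tag are hypothetical, chosen only for the example.

```python
import re

# Python spells named groups (?P<name>...);
# .NET would write (?<name>...) or (?'name'...) instead.
TAG = re.compile(r"^(?P<area>\d{3})-(?P<kind>[A-Z]{2})-(?P<number>\d{5})$")


def split_tag(tag):
    """Return the named parts of a tag such as '240-TE-24381', or None."""
    m = TAG.match(tag)
    return m.groupdict() if m else None


print(split_tag("240-TE-24381"))
# {'area': '240', 'kind': 'TE', 'number': '24381'}
```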
£69 GBP in 4 days
5.0 (9 reviews)
4.5
9 freelancers are bidding on average £141 GBP for this job
Hello, I can help you with your project: Extract and concatenate to single string 2 or 3 or 4.. text elements in a pdf, by regex and proximity. I have gone through your job posting and am very interested in working with you. I am an expert in this field and have already completed several projects like this. For evidence, you can see my profile. Please visit: https://www.freelancer.com/u/schoudhary1553 I have an excellent command of English. I am hard-working and productive, and I hope I would be the right candidate for this post. Awaiting an affirmative response from you. Kind regards, Sandeep
£220 GBP in 4 days
5.0 (35 reviews)
5.9
I am a PDF expert. I can write code to extract text from a raw PDF without libraries, though it works for simple PDFs only. I hope your PDF is like that; please send it so I can check.
£200 GBP in 3 days
5.0 (3 reviews)
3.8
Hi Claire J.! I'm a graphic designer based in Vancouver, Canada, with over 6 years' experience. I've previously worked on PDF and VB.NET projects for other employers. Please see my portfolio at www.visak2691.com. I look forward to working on this project with you. Thank you, Vishakh
£142 GBP in 7 days
4.5 (6 reviews)
3.7
VB.NET expert with PDF processing experience. Interested in doing your regex-matching project.
£145 GBP in 7 days
4.9 (7 reviews)
3.1
Hello, dear. I have read all your requirements for 'Extract and concatenate to single string 2 or 3 or 4.. text elements in a pdf, by regex and proximity' and I fully understand them. I have done this kind of project before, and I am confident that I can finish this one. Please get in touch so that we can discuss the details via chat :) Skills: PDF, VB.NET
£150 GBP in 1 day
5.0 (2 reviews)
2.5
10+ years' experience in C#. I have experience processing inconsistent Excel and PDF files. Can complete in a day.
£111 GBP in 1 day
5.0 (1 review)
1.9
I need this project and will do my best work for you. Any employer may contact me. I do professional data entry work.
£135 GBP in 7 days
0.0 (0 reviews)
0.0

About the client

Bagshot, United Kingdom
5.0
4
Payment method verified
Member since Sep 8, 2017
