OpenCV detect numbers in table

I'm trying to clean up images containing tables of numbers for OCR.
You can see a sample here:



Test image 1





My current pipeline is as follows:



1/ Resize the image to width=256, keeping the aspect ratio



h, w = img.shape[:2]
ratio = 256 / w
img = cv2.resize(img, None, fx=ratio, fy=ratio, interpolation=cv2.INTER_LANCZOS4)


2/ Convert it to grayscale



gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)


3/ As images tend to have table borders near the edges, I remove 3px from the image borders.



gray = gray[3:-3, 3:-3]


The following two steps are taken from PyImageSearch.



4/ Apply Gaussian blur to remove some noise



blurred = cv2.GaussianBlur(gray, (3,3), 0)


5/ Apply blackhat operator (not sure if necessary)



kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (13,5))
blackhat = cv2.morphologyEx(blurred, cv2.MORPH_BLACKHAT, kernel)


6/ Detect and remove long lines (table borders) with HoughLines



edges = imutils.auto_canny(blurred)
# horizontal lines
hlines = cv2.HoughLines(edges, 1, np.pi/180, min(100, int(w*.8)),
                        min_theta=np.radians(85),
                        max_theta=np.radians(95))
horizontal = [] if hlines is None else [line[0] for line in hlines]
# vertical lines
vlines = cv2.HoughLines(edges, 1, np.pi/180, min(100, int(h*.8)),
                        min_theta=np.radians(-5),
                        max_theta=np.radians(5))
vertical = [] if vlines is None else [line[0] for line in vlines]
# merge nearby lines using a long and boring function
horizontal = merge_lines(horizontal)
vertical = merge_lines(vertical)
# draw all the remaining lines onto the blackhat image
# width=3px, color=0 (black) to remove the table borders
blackhat = draw_lines(horizontal, blackhat, 0, 3)
blackhat = draw_lines(vertical, blackhat, 0, 3)
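
merge_lines and draw_lines are my own helpers, omitted because they are long and boring. For context, draw_lines amounts to the standard rho/theta-to-endpoints drawing; a rough sketch of what it does (not the exact code) is:

def draw_lines(lines, img, color, thickness):
    # Sketch of the omitted helper: convert each (rho, theta) pair returned by
    # cv2.HoughLines into two far-apart endpoints and draw that segment over img.
    for rho, theta in lines:
        a, b = np.cos(theta), np.sin(theta)
        x0, y0 = a * rho, b * rho
        pt1 = (int(x0 + 1000 * (-b)), int(y0 + 1000 * a))
        pt2 = (int(x0 - 1000 * (-b)), int(y0 - 1000 * a))
        cv2.line(img, pt1, pt2, color, thickness)
    return img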


7/ (From PyImageSearch) Compute Scharr gradient, then use Otsu threshold to detect the text region



def scharr_gradient(img):
    # ksize=-1 selects the 3x3 Scharr kernel
    sobel_x = cv2.Sobel(img, ddepth=cv2.CV_32F, dx=1, dy=0, ksize=-1)
    sobel_x = np.absolute(sobel_x)
    (min_, max_) = (np.min(sobel_x), np.max(sobel_x))
    sobel_x = (255 * ((sobel_x - min_) / (max_ - min_))).astype(np.uint8)
    return sobel_x

scharr = scharr_gradient(blackhat)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 5))
closed = cv2.morphologyEx(scharr, cv2.MORPH_CLOSE, kernel)
_, thresh = cv2.threshold(closed, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)


8/ Apply the mask to the original grayscale image to get a clean image



mask = np.bitwise_not(thresh).astype(np.float32)
masked = np.clip(mask + gray, 0, 255).astype(np.uint8)


The problems:



  1. The Scharr operation at step 7 has a hard time detecting minus signs. Can you suggest a better way to localize the text?

  2. My current pipeline fails on noisy images like this one. Is there anything I can do to deal with them? I tried playing with contrast (roughly the kind of adjustment sketched below), but it made other cases worse.
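
To make "playing with contrast" concrete, a rough sketch of the kind of local-contrast adjustment I mean (CLAHE is only one example here; the clip limit and tile size are arbitrary illustration values, not tuned ones):

# Sketch only: CLAHE as one possible contrast boost, applied before the blur in step 4.
# clipLimit and tileGridSize are arbitrary values for illustration.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
gray = clahe.apply(gray)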

Other things I tried:



  1. Removing the table borders by detecting the largest connected component didn't work, because some numbers are connected to the borders and sometimes the borders themselves are broken due to bad scan quality (a rough sketch of that attempt is below).
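
A rough sketch of that connected-component attempt (illustrative, not the exact code; the Otsu binarization and largest-area selection are the general idea):

# Sketch of the connected-component idea: binarize, label components,
# and blank out the largest one on the assumption that it is the table border.
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
n, labels, stats, _ = cv2.connectedComponentsWithStats(bw, connectivity=8)
if n > 1:
    largest = 1 + np.argmax(stats[1:, cv2.CC_STAT_AREA])  # label 0 is the background
    bw[labels == largest] = 0  # also erases digits touching the border, hence the failure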









python opencv ocr

asked Nov 13 '18 at 2:47 by Nghia Truong
edited Nov 13 '18 at 14:14 by rmtheis


  • can you show the image you finally obtained? This kind of task should not be too difficult. Take a look at my questions/answers.
    – Link
    Nov 13 '18 at 21:31