Type Definitions

OCRCoordinates / Bounds

Used for determining a location on the screen.
export interface OCRCoordinates {
  top: number;    // y coordinate in pixels of the text bounding box's top-left
                  // corner, measured from the top of the image
  left: number;   // x coordinate in pixels of the text bounding box's top-left
                  // corner, measured from the left of the image
  width: number;  // width of the text bounding box in pixels
  height: number; // height of the text bounding box in pixels
}
type Bounds = {
  top: number;    // y coordinate in pixels of the text bounding box's top-left
                  // corner, measured from the top of the image
  left: number;   // x coordinate in pixels of the text bounding box's top-left
                  // corner, measured from the left of the image
  width: number;  // width of the text bounding box in pixels
  height: number; // height of the text bounding box in pixels
};
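As an illustration of working with these coordinates, the sketch below computes the center of a bounding box and tests whether a point falls inside it. The helper names centerOf and containsPoint are hypothetical and are not part of the documented API; the Bounds type is repeated so the snippet stands alone.

```typescript
// Same shape as the Bounds type documented above.
type Bounds = { top: number; left: number; width: number; height: number };

// Hypothetical helper: center of the box, in the same pixel coordinate space.
function centerOf(b: Bounds): { x: number; y: number } {
  return { x: b.left + b.width / 2, y: b.top + b.height / 2 };
}

// Hypothetical helper: true when (x, y) lies inside the box.
// The right and bottom edges are treated as exclusive.
function containsPoint(b: Bounds, x: number, y: number): boolean {
  return (
    x >= b.left &&
    x < b.left + b.width &&
    y >= b.top &&
    y < b.top + b.height
  );
}
```

Treating the far edges as exclusive is a design choice of this sketch, so that two adjacent boxes never both claim the same pixel.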

OCRResult

type OCRResult = {
  level: number;
  page_num: number;
  block_num: number;
  par_num: number;
  line_num: number;
  word_num: number;
  left: number;
  top: number;
  width: number;
  height: number;
  confidence: number;
  text: string;
};
Each field is described below, with what it represents and the range of values it can take.
  • level: hierarchical layout (a word is in a line, which is in a paragraph, which is in a block, which is in a page), a value from 1 to 5
    • 1: page
    • 2: block
    • 3: paragraph
    • 4: line
    • 5: word
  • page_num: the file index when given a list of images, or the page number when given a multi-page document; starts from 1
  • block_num: block number within the page, starting from 0
  • par_num: paragraph number within the block, starting from 0
  • line_num: line number within the paragraph, starting from 0
  • word_num: word number within the line, starting from 0
  • left: x coordinate in pixels of the text bounding box top left corner, starting from the left of the image
  • top: y coordinate in pixels of the text bounding box top left corner, starting from the top of the image
  • width: width of the text bounding box in pixels
  • height: height of the text bounding box in pixels
  • confidence: confidence value, from 0 (no confidence) to 100 (maximum confidence); -1 for all levels except 5
  • text: detected text, empty for all levels except 5
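
Because text and confidence are only populated at level 5, a common first step is to keep the word-level rows and regroup them into lines. The following is a minimal sketch of that idea, not part of the documented API: the function names confidentWords and linesOfText and the default threshold of 60 are assumptions chosen for illustration.

```typescript
// Same shape as the OCRResult type documented above.
type OCRResult = {
  level: number;
  page_num: number;
  block_num: number;
  par_num: number;
  line_num: number;
  word_num: number;
  left: number;
  top: number;
  width: number;
  height: number;
  confidence: number;
  text: string;
};

const WORD_LEVEL = 5; // level 5 = word, per the list above

// Hypothetical helper: keep non-empty word rows at or above the threshold.
function confidentWords(results: OCRResult[], minConfidence = 60): OCRResult[] {
  return results.filter(
    (r) =>
      r.level === WORD_LEVEL &&
      r.confidence >= minConfidence &&
      r.text.trim() !== ""
  );
}

// Hypothetical helper: group words by (page, block, paragraph, line) and
// join each group into one string, ordered by word_num.
function linesOfText(results: OCRResult[], minConfidence = 60): string[] {
  const lines = new Map<string, OCRResult[]>();
  for (const word of confidentWords(results, minConfidence)) {
    const key = [word.page_num, word.block_num, word.par_num, word.line_num].join(":");
    const bucket = lines.get(key) ?? [];
    bucket.push(word);
    lines.set(key, bucket);
  }
  // Map preserves insertion order, so lines appear in first-seen order.
  return Array.from(lines.values()).map((words) =>
    words
      .sort((a, b) => a.word_num - b.word_num)
      .map((w) => w.text)
      .join(" ")
  );
}
```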

HashType

Used for specifying which algorithm to use for hashing images.
export enum HashType {
  Average,
  Difference,
  Perception,
}
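
For readers unfamiliar with image hashing, here is a rough sketch of the general idea behind an average hash: each pixel of a small grayscale image becomes one bit, set when the pixel is at least as bright as the mean, and two hashes are compared by counting differing bits. This illustrates the well-known technique only; it is not the library's implementation, and averageHash and hammingDistance are hypothetical names. The input is assumed to be already downscaled and converted to grayscale values from 0 to 255.

```typescript
// Hypothetical helper: average hash of a small grayscale image, given as a
// flat array of brightness values (for example 8 x 8 = 64 values, 0-255).
// Returns a string of "0"/"1" bits, one per pixel.
function averageHash(gray: number[]): string {
  if (gray.length === 0) throw new Error("empty image");
  const mean = gray.reduce((sum, v) => sum + v, 0) / gray.length;
  return gray.map((v) => (v >= mean ? "1" : "0")).join("");
}

// Hypothetical helper: number of bit positions where two hashes differ.
// A small distance suggests the two images look similar.
function hammingDistance(a: string, b: string): number {
  if (a.length !== b.length) throw new Error("hash lengths differ");
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) distance++;
  }
  return distance;
}
```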

OcrEvent

Used for listenOcrMonitor; specifies the bounds of the area and the text found there before and after a change.
interface OcrEvent {
  old: TextAndBounds;
  new: TextAndBounds;
}
interface TextAndBounds {
  bounds: Rectangle;
  text: string;
}
type Bounds = {
  top: number;
  left: number;
  width: number;
  height: number;
};
type Rectangle = {
  Max: Point;
  Min: Point;
};
type Point = {
  X: number;
  Y: number;
};
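
Since TextAndBounds reports a Rectangle while other types use Bounds, converting between the two may be useful. The sketch below assumes that Min is the top-left corner and Max is the bottom-right corner; the source does not state this, so verify it before relying on it. The names rectangleToBounds and textChanged are hypothetical.

```typescript
type Point = { X: number; Y: number };
type Rectangle = { Max: Point; Min: Point };
type Bounds = { top: number; left: number; width: number; height: number };
interface TextAndBounds {
  bounds: Rectangle;
  text: string;
}
interface OcrEvent {
  old: TextAndBounds;
  new: TextAndBounds;
}

// Hypothetical helper. Assumption: Min is the top-left corner and Max is the
// bottom-right corner of the rectangle.
function rectangleToBounds(r: Rectangle): Bounds {
  return {
    top: r.Min.Y,
    left: r.Min.X,
    width: r.Max.X - r.Min.X,
    height: r.Max.Y - r.Min.Y,
  };
}

// Hypothetical helper: true when the recognized text differs between the
// old and new snapshots of an event.
function textChanged(event: OcrEvent): boolean {
  return event.old.text !== event.new.text;
}
```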