[FREE] HPass - HTML Parser (Beta v0.1.0)

HPass - HTML Parser

Extension for fetching, extracting, and parsing data.

📄 Description

HPass - HTML Parser is an extension that can be used for fetching, extracting, and parsing data.


🧩 Documentation (Blocks)

Create Connection

Creates a connection to the specified website URL.

Connection Timeout

Configures the total connection timeout duration to get the document. The timeout is specified in milliseconds. The default timeout is 30 seconds (30,000 milliseconds). A timeout of zero is treated as an infinite timeout.
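
The millisecond convention described above can be illustrated with a small Python sketch. The extension itself is used through blocks, so this is only an illustration: `to_timeout_seconds` is a hypothetical helper, not part of the extension, and rejecting negative values is an assumption.

```python
DEFAULT_TIMEOUT_MS = 30_000  # default stated above: 30 seconds


def to_timeout_seconds(timeout_ms=DEFAULT_TIMEOUT_MS):
    """Convert a millisecond timeout to seconds; 0 means wait indefinitely."""
    if timeout_ms < 0:
        # Assumption: negative timeouts are treated as invalid input.
        raise ValueError("timeout must not be negative")
    if timeout_ms == 0:
        return None  # None is how Python sockets express "no timeout"
    return timeout_ms / 1000


print(to_timeout_seconds())      # 30.0
print(to_timeout_seconds(5000))  # 5.0
print(to_timeout_seconds(0))     # None
```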

Connection User-Agent

Configures the request User-Agent Header. Some websites do not allow access to their website unless the User-Agent is specified.

Connection Cookies

Adds each of the cookies to the connection. The cookies parameter must be a dictionary that maps each cookie name to its value.
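
As a rough illustration of what a name-to-value cookie dictionary turns into on the wire, the Python sketch below joins the pairs into a single `Cookie` header value. `cookie_header` and the sample cookie names are hypothetical; the extension builds the header internally.

```python
def cookie_header(cookies):
    """Join a name -> value dictionary into a single Cookie header value."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


cookies = {"session": "abc123", "theme": "dark"}  # sample values, not real cookies
print(cookie_header(cookies))  # session=abc123; theme=dark
```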

Connection Parameters

Adds each of the parameters to the connection. The parameters must be a dictionary that maps each parameter name to its value. They are sent in the request as a query string.
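
To show how a parameter dictionary becomes a query string, here is a minimal Python sketch using the standard library. `with_query` and the example URL are hypothetical, and the exact encoding the extension applies (for instance `+` versus `%20` for spaces) may differ.

```python
from urllib.parse import urlencode


def with_query(url, params):
    """Append a name -> value dictionary to a URL as a query string."""
    if not params:
        return url
    separator = "&" if "?" in url else "?"
    return url + separator + urlencode(params)


print(with_query("https://example.com/search", {"q": "html parser", "page": 2}))
# https://example.com/search?q=html+parser&page=2
```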

Connection Follow Redirects

Configures whether the connection should follow server redirects. By default this is true.

Connection Headers

Adds each of the headers to the connection. Headers must be a dictionary with the name of the header and its value.

Connection Ignore HTTP Errors

Configures the connection to not throw errors when an HTTP error occurs (for example, 404 or 500). By default, this is set to false. If set to true, the error and its description, if any, may be reflected in the returned HTML.

Connection Ignore Content-Type

Configures the connection to not throw errors when the content type is unrecognized. If set to true, the document’s content type is ignored and parsing is attempted regardless.

Connection Proxy

Configures the HTTP proxy to use for this request.

Connection Referrer

Configures the referrer header.

Connection Request Body

Configures the request body. It is useful when a server expects a plain request body.

Get Document

Executes a GET request with the specified connection ID. After the document is received, the GotDocument event is triggered.

Parse

Parses the given HTML.



Get Body

Returns the body element.

Get Head

Returns the head element.

Get Title

Returns the title of the website.

Is Charset Supported

Checks whether the specified charset is supported for the website.

Pretty Print

Returns whether pretty print is enabled.

Get Whole Text

Returns the (unencoded) text of all children of this element, including any newlines and spaces present in the original.

Get Outer HTML

Returns the outer HTML of this element. For example, on a p element, this may return <p>Kodular</p>.

Tag Name

Returns the name of the tag for this element. Example: div.

Get Parent

Returns this element’s parent element.

Get ID

Returns the id attribute of this element.

Own Text

Returns the (normalized) text owned by this element only; does not get the combined text of all children. For example, given the HTML <p>Hello <b>Amazing </b>Kodular!</p>, it returns "Hello Kodular!".
Note: The text within the b element is not returned, as it is not a direct child of the p element.
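
The difference between an element's own text and its combined text can be sketched in Python with the standard `html.parser` module. This is only an illustration of the behavior described above, not the extension's implementation; `TextCollector`, `own_text`, and `full_text` are hypothetical names, and the sketch assumes a fragment with a single root element and no void tags such as `<br>`.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects text directly inside the root element and text at any depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.own_parts = []
        self.all_parts = []

    def handle_starttag(self, tag, attrs):
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1

    def handle_data(self, data):
        self.all_parts.append(data)
        if self.depth == 1:  # text that is a direct child of the root element
            self.own_parts.append(data)


def _collect(html):
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser


def _normalize(parts):
    # Collapse runs of whitespace and trim both ends.
    return " ".join("".join(parts).split())


def own_text(html):
    return _normalize(_collect(html).own_parts)


def full_text(html):
    return _normalize(_collect(html).all_parts)


sample = "<p>Hello <b>Amazing </b>Kodular!</p>"
print(own_text(sample))   # Hello Kodular!
print(full_text(sample))  # Hello Amazing Kodular!
```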

Node Name

Returns the node name of this node. Mainly used for debugging purposes.

Normal Name

Returns the normalized name of this element’s tag. This is always the lowercased version of the tag, regardless of the tag’s case. For example, <DIV> and <div> both have a normal name of div.

Is Block

Checks whether this element is a block-level element. For example, <div> returns true, while an inline element such as <span> returns false.

Matches

Checks whether this element matches the given query. Also known as matches() in the Web DOM.

Has Attribute

Checks whether the element has the specified attribute. This is case insensitive.

Has Text

Checks whether the element has any text content (that is not just whitespace).

Has Class

Checks whether this element has the specified class. This is case insensitive.

Get Absolute URL

Returns an absolute URL from an element attribute whose value may be relative (such as an <a href> or <img src>). This is case insensitive.
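
Resolving a relative attribute value against the page URL can be illustrated with Python's standard `urljoin`. The URLs are made up, and `absolute_url` is a hypothetical helper that only mirrors the idea of the block.

```python
from urllib.parse import urljoin


def absolute_url(page_url, attribute_value):
    """Resolve a possibly relative href/src value against the page URL."""
    return urljoin(page_url, attribute_value)


page_url = "https://example.com/docs/index.html"
print(absolute_url(page_url, "../img/logo.png"))
# https://example.com/img/logo.png
print(absolute_url(page_url, "guide.html"))
# https://example.com/docs/guide.html
print(absolute_url(page_url, "https://other.example/a.png"))
# https://other.example/a.png
```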

Get Attribute Value

Returns an attribute’s value by its specified key.

Get Element Sibling Index

Returns the index of this element in its element sibling list.
Note: If this is the first element sibling, returns 0.

Get Data

Returns the combined data of this element. Data should be used for the contents of scripts, comments, CSS styles, etc.

Get Data Set

Returns the specified element’s HTML5 custom data attributes. Each attribute in the element that has a key starting with data- is included in the dataset. It returns a dictionary with keys and values.
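
A rough Python sketch of the filtering described above: it reads the attributes of the first tag in a fragment and keeps only those whose key starts with `data-`. `DatasetParser` and `get_data_set` are hypothetical names. Whether the extension keeps or strips the `data-` prefix in the returned keys is not stated here; this sketch keeps the full attribute name as an assumption.

```python
from html.parser import HTMLParser


class DatasetParser(HTMLParser):
    """Collects the data-* attributes of the first start tag it sees."""

    def __init__(self):
        super().__init__()
        self.dataset = None

    def handle_starttag(self, tag, attrs):
        if self.dataset is None:
            # Assumption: keys keep their "data-" prefix in the result.
            self.dataset = {k: v for k, v in attrs if k.startswith("data-")}


def get_data_set(html):
    parser = DatasetParser()
    parser.feed(html)
    parser.close()
    return parser.dataset or {}


print(get_data_set('<div id="card" data-user="42" data-role="admin">Hi</div>'))
# {'data-user': '42', 'data-role': 'admin'}
```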

Get Number Of Attributes

Returns the number of attributes that the specified element has.

Get Classes

Returns all of the specified element’s class names.

Get HTML

Returns the specified element’s inner HTML.

Get Text

Returns the normalized, combined text of this element and all its children. Whitespace is normalized and trimmed.
Note: This returns the textual content that would be presented to a reader.

Get Attributes

Returns a dictionary of attributes with key and value of each attribute.



Convert Document to Element

Converts the document to an element, which is helpful for various functions.

Get Elements By Sibling Index Greater Than

Returns the list of elements whose sibling index is greater than the specified index.

Get Elements By Sibling Index Lesser Than

Returns the list of elements whose sibling index is less than the specified index.

Get Elements By Sibling Index Equals

Returns the list of elements whose sibling index is equal to the specified index.
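
The three sibling-index blocks above differ only in the comparison they apply to the zero-based index. The Python sketch below illustrates that idea with the standard `xml.etree.ElementTree` module on a small, well-formed fragment. `elements_by_sibling_index` is a hypothetical helper, and leaving out the root element (which has no parent in this sketch) is an assumption.

```python
import xml.etree.ElementTree as ET


def elements_by_sibling_index(root, predicate):
    """Return descendants whose zero-based index among their siblings passes predicate."""
    matches = []
    for parent in root.iter():
        for index, child in enumerate(parent):
            if predicate(index):
                matches.append(child)
    return matches


root = ET.fromstring("<ul><li>a</li><li>b</li><li>c</li></ul>")

greater = [e.text for e in elements_by_sibling_index(root, lambda i: i > 0)]
less = [e.text for e in elements_by_sibling_index(root, lambda i: i < 1)]
equal = [e.text for e in elements_by_sibling_index(root, lambda i: i == 1)]

print(greater)  # ['b', 'c']
print(less)     # ['a']
print(equal)    # ['b']
```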

Get Elements Containing Text

Returns the list of elements that contain the text specified. The search is case insensitive. The text may appear directly in the element, or in any of its descendants.

Get Elements By Class

Returns the list of elements which have the specified class, including or under this element. This is case insensitive.

Get Elements By Attribute Value Prefix

Returns the list of elements that have an attribute with the specified key whose value starts with the specified prefix. This is case insensitive.

Get Elements By Attribute Value Not

Returns the list of elements that either do not have the specified attribute or have it with a different value. This is case insensitive.

Get Elements By Attribute Value Suffix

Returns the list of elements that have the specified attribute with a value ending in the specified suffix. This is case insensitive.

Get All Elements

Returns the list of elements under this element (including self, and children of children).

Get Elements By Attribute

Returns the list of elements that have this attribute key.

Get Elements By Attribute Key Prefix

Returns the list of elements which have an attribute name starting with the specified prefix. You may use data- to find elements that have HTML5 datasets.

Get Elements By Attribute Value Containing

Returns the list of elements that have attributes whose value contains the match string. This is case insensitive.

Get Elements By Attribute Key Value

Returns the list of elements that have an attribute with the specified key and value. This is case insensitive.
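
The attribute-value blocks (prefix, suffix, containing, not, and key/value) can be thought of as simple predicates over an element's attributes. The Python sketch below models each element as a plain dictionary of attributes; all function names and sample data are hypothetical, and it assumes that "case insensitive" applies to the value comparison.

```python
def value_equals(attrs, key, value):
    return key in attrs and attrs[key].lower() == value.lower()


def value_not(attrs, key, value):
    # Matches when the attribute is missing or has a different value.
    return not value_equals(attrs, key, value)


def value_starts_with(attrs, key, prefix):
    return key in attrs and attrs[key].lower().startswith(prefix.lower())


def value_ends_with(attrs, key, suffix):
    return key in attrs and attrs[key].lower().endswith(suffix.lower())


def value_contains(attrs, key, text):
    return key in attrs and text.lower() in attrs[key].lower()


# Each dictionary stands in for one element's attributes.
elements = [
    {"href": "HTTPS://example.com/a.PDF"},
    {"href": "/local/page.html"},
    {"id": "no-href"},
]


def indexes(predicate, *args):
    return [i for i, attrs in enumerate(elements) if predicate(attrs, *args)]


print(indexes(value_starts_with, "href", "https://"))  # [0]
print(indexes(value_ends_with, "href", ".pdf"))        # [0]
print(indexes(value_contains, "href", "LOCAL"))        # [1]
print(indexes(value_not, "href", "/local/page.html"))  # [0, 2]
```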

Get Element By ID

Returns the element whose id is equal to the specified id, including or under this element.
Note: This finds the first matching ID, starting with this element. A document may contain more than one element with the specified id.

Get First Element In Siblings

Returns the first element sibling of this element. This may be the same element as specified.

Get Elements By Tag Name

Returns the list of elements including and recursively under the specified element, with the specified tag name.

Select

Returns the list of elements that match the CSS query, with the specified element as the starting context. Matched elements may include the specified element or any of its children. This is querySelectorAll() in the Web DOM.



Error

Triggered when an error occurs.

Got Document

Triggered when the document has been received.



Usage (How to Apply)

Establishing Connection

Parsing



🎥 Examples

Custom Search Engine

Note: Search Results by Google

Retrieve Favicon & Title

Note: Usually the last element is the largest icon. Websites that load their favicon with JavaScript may not work.
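
The same idea can be sketched outside the block editor in Python, using only the standard library. This is an illustration under assumptions, not the blocks used in the example: `FaviconTitleParser` and `favicon_and_title` are hypothetical names, the HTML and URL are made up, and only `<link>` tags whose `rel` contains the token `icon` are collected.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FaviconTitleParser(HTMLParser):
    """Collects the page title and the href of every <link rel="... icon ...">."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.icons = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "link":
            rel_tokens = (attrs.get("rel") or "").lower().split()
            if "icon" in rel_tokens and attrs.get("href"):
                self.icons.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def favicon_and_title(html, page_url):
    parser = FaviconTitleParser()
    parser.feed(html)
    parser.close()
    # Following the note above, pick the last icon listed, if any.
    icon = urljoin(page_url, parser.icons[-1]) if parser.icons else None
    return icon, parser.title.strip()


html = """<html><head><title>Example Site</title>
<link rel="icon" href="/favicon-16.png" sizes="16x16">
<link rel="shortcut icon" href="/favicon-192.png" sizes="192x192">
</head><body></body></html>"""

print(favicon_and_title(html, "https://example.com/"))
# ('https://example.com/favicon-192.png', 'Example Site')
```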


⚠️ Information

Results may not be as desired if the website uses JavaScript to load content, because this extension only parses the HTML it receives and does not run scripts.

The connection functions take time to get the document, depending on the website.


🔗 Reference


📁 Download

v0.1.0.aix (346.2 KB)

Amazing extension… Good job​:+1:

Nice Extension with a big documentation. Good job buddy :+1:

It’s great

Very useful.
Great work man