Perl Parse Links From Html Table

I'm trying to get links from a table in HTML. Using HTML::TableExtract, I'm able to parse the table and get the text (i.e. Ability, Abnormal in the example below) but cannot get the link that inv

Solution 1:

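Use HTML::LinkExtor, passing the content extracted from each table cell to its parse method. Note that the cells only contain markup (and therefore links) if HTML::TableExtract was constructed with keep_html, as in Solution 2; by default it returns just the visible text.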

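use HTML::LinkExtor;

my $le = HTML::LinkExtor->new();

# $te is the HTML::TableExtract object that has already parsed the page
foreach my $ts ($te->tables) {
    foreach my $row ($ts->rows) {
        $le->parse($row->[0]);
        # links() returns the links found so far and empties the internal list
        for my $link_tag ( $le->links ) {
            my ($tag, %links) = @$link_tag;
            # next if $tag ne 'a'; # exclude other kinds of links?
            print "$_\n" for values %links;
        }
    }
}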

Solution 2:

Use the keep_html option in the HTML::TableExtract constructor. The documentation describes it as follows:

keep_html

Return the raw HTML contained in the cell, rather than just the visible text. Embedded tables are not retained in the HTML extracted from a cell. Patterns for header matches must take into account HTML in the string if this option is enabled. This option has no effect if extracting into an element tree structure.

my $te = HTML::TableExtract->new( keep_html => 1, headers => [qw(field1 ... fieldN)] );
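
For anyone who wants to see the two answers working together, here is a minimal sketch. The sample HTML, the header names (Word, Count) and the URLs are made up for illustration and are not from the question; the only assumption about the real page is that the links sit in the first matched column.

```perl
use strict;
use warnings;
use HTML::TableExtract;
use HTML::LinkExtor;

# Hypothetical sample input; headers and URLs are invented for this sketch.
my $html = <<'HTML';
<table>
  <tr><th>Word</th><th>Count</th></tr>
  <tr><td><a href="/w/ability">Ability</a></td><td>3</td></tr>
  <tr><td><a href="/w/abnormal">Abnormal</a></td><td>5</td></tr>
</table>
HTML

# keep_html => 1 makes each cell hold its raw HTML instead of only the text.
my $te = HTML::TableExtract->new( keep_html => 1, headers => [qw(Word Count)] );
$te->parse($html);
$te->eof;

my @hrefs;
for my $ts ( $te->tables ) {
    for my $row ( $ts->rows ) {
        my $cell = $row->[0];    # first matched column (Word)
        next unless defined $cell;

        # A fresh extractor per cell keeps links from different cells apart.
        my $le = HTML::LinkExtor->new;
        $le->parse($cell);
        $le->eof;

        for my $link ( $le->links ) {
            my ( $tag, %attr ) = @$link;
            push @hrefs, $attr{href} if $tag eq 'a' && defined $attr{href};
        }
    }
}

print "$_\n" for @hrefs;
```

Creating a new HTML::LinkExtor per cell is a choice made here for clarity; reusing one object as in Solution 1 also works, since links() empties the internal list each time it is called. If the hrefs are relative, HTML::LinkExtor->new accepts a callback and a base URL as optional arguments so the links can be made absolute.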
