Writing a Perl script to extract HTML content?

  • Thread starter: Jemi Sulivan

Hello,

I would like your help with extracting content from HTML source code using Perl. Could anybody help me extract the content of an HTML page without the tags? For example, I would like to take the content of a table and insert it into a neatly formatted table. I have had enough of very brief answers and would be grateful for a detailed one. Please note that:
- I have quite good knowledge of Perl scripting and of programming as a whole.
- I also have good knowledge of HTML file content.

I hope only experts answer this for me.

Thanks a lot
 
This is a piece of one of my scripts. It fetches the page at the given URL and prints it out. In the foreach loop you would replace the print with whatever you want to do with the web page contents.


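# Setup of script
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

my $pageURL = "http://url.com";    # URL of the page to fetch
my $ua = LWP::UserAgent->new;
$ua->proxy(['http'], 'http://my.proxy.com/');    # set proxy; remove if you are not behind one

# Fetch the page
my $req = HTTP::Request->new(GET => $pageURL);
my $res = $ua->request($req);
my @contents;

if ($res->is_success) {
    # Split the page body into lines
    @contents = split(/\n/, $res->content);
}
else {
    # Include the HTTP status so the failure can be diagnosed
    die "Could not get content: " . $res->status_line . "\n";
}

# Replace the print with whatever you want to do with each line
foreach my $line (@contents) {
    print "$line\n";
}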
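
The script above only fetches the page; it does not strip the tags or pull out the table yet. One way that part might be sketched is with HTML::Parser, which is usually installed alongside LWP (I am assuming it is available on your system). The sample HTML and the helper name extract_rows are made up for illustration; in the real script you would pass the fetched page body instead. Treat this as a rough starting point, not a complete solution: it assumes every cell and row has a closing tag and it does not handle nested tables.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample input; in the script above this would be the page body.
my $html = <<'HTML';
<table>
<tr><th>Name</th><th>Qty</th></tr>
<tr><td>Apple</td><td>3</td></tr>
<tr><td>Kiwi &amp; lime</td><td>12</td></tr>
</table>
HTML

# extract_rows is a hypothetical helper: it returns one array ref per <tr>,
# each holding the text of that row's <td>/<th> cells with the tags removed.
sub extract_rows {
    my ($source) = @_;
    my (@rows, $row, $cell);

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Opening tag: start a new row or a new cell
        start_h => [ sub {
            my ($tag) = @_;
            if    ($tag eq 'tr')                 { $row  = [] }
            elsif ($tag eq 'td' || $tag eq 'th') { $cell = '' }
        }, 'tagname' ],
        # Text: collect it only while inside a cell ('dtext' decodes entities)
        text_h => [ sub {
            my ($text) = @_;
            $cell .= $text if defined $cell;
        }, 'dtext' ],
        # Closing tag: finish the current cell or row
        end_h => [ sub {
            my ($tag) = @_;
            if (($tag eq 'td' || $tag eq 'th') && defined $cell) {
                $cell =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
                push @$row, $cell if $row;
                undef $cell;
            }
            elsif ($tag eq 'tr' && $row) {
                push @rows, $row;
                undef $row;
            }
        }, 'tagname' ],
    );
    $parser->parse($source);
    $parser->eof;
    return @rows;
}

my @rows = extract_rows($html);

# Print each row as padded, pipe-separated columns
for my $r (@rows) {
    print join(' | ', map { sprintf '%-12s', $_ } @$r), "\n";
}
```

To combine this with the fetching script, you could call extract_rows($res->decoded_content) inside the is_success branch instead of splitting the body into lines. If you are free to install CPAN modules, HTML::TableExtract is also worth a look, since it is written specifically for pulling tables out of pages.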
 