Extracting text from the alt attributes of a webpage with regex
March 13, 2008
The other day my fiance needed to get some info from a webpage that was stored in its alt attributes (the "alt tags" on images). I decided to write up a script in PHP to do it.
I’m not the best at regular expressions, so I had to look up how to do it on the web. Regex is a pain to understand, but if you work at it you start to realize how much power it has.
So without further ado, here is the regex that suited me for this situation. It finds alt attributes whose quoted text is made up only of letters, digits, spaces, colons, hyphens, and periods:
(alt=\"[A-Za-z: 0-9-\.]+\")
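Here is a quick sketch of that pattern in action. The sample markup and variable names are made up for illustration, and the surrounding slashes are just the delimiters that PHP's preg functions need.

```php
<?php
// Sample markup, invented for this example.
$html = '<img src="a.png" alt="Figure 1: Sales 2007"> <img src="b.png" alt="Logo-v1.2">';

// The pattern from the post, wrapped in / delimiters.
// The \" is simply a literal double quote to the regex engine.
$pattern = '/(alt=\"[A-Za-z: 0-9-\.]+\")/';

// preg_match_all fills $matches; index 0 holds every full match.
preg_match_all($pattern, $html, $matches);

foreach ($matches[0] as $match) {
    echo $match, "\n";
}
// Prints:
// alt="Figure 1: Sales 2007"
// alt="Logo-v1.2"
```

Keep in mind that alt text with any other character in it (a comma or an apostrophe, for example) will not match, so the character class may need widening for other pages.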
The first thing I did was download the webpage onto a server.
I opened up the file with fopen.
Then I did an explode to load the content into an array.
Finally, I did a foreach to loop through the array, and inside the loop I did a preg_match_all to find what I was looking for.
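The steps above could look roughly like the sketch below. This is not the original script: the function name, the file name in the usage comment, and splitting on newlines are all assumptions, since the post does not say what the explode delimiter was.

```php
<?php
// Hypothetical helper that follows the steps described above.
function find_alt_attributes(string $path): array
{
    // Open the downloaded page for reading.
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return [];
    }
    $content = stream_get_contents($handle);
    fclose($handle);
    if ($content === false) {
        return [];
    }

    // Split the content into an array, one entry per line (assumed delimiter).
    $lines = explode("\n", $content);

    $found = [];
    foreach ($lines as $line) {
        // preg_match_all returns the number of matches, so 0 skips the line.
        if (preg_match_all('/(alt=\"[A-Za-z: 0-9-\.]+\")/', $line, $matches)) {
            foreach ($matches[1] as $match) {
                $found[] = $match;
            }
        }
    }
    return $found;
}

// Usage, with a made-up file name:
// foreach (find_alt_attributes('page.html') as $alt) {
//     echo $alt, "\n";
// }
```

Each entry in the result still includes the alt="..." wrapper, because that is what the pattern captures. Moving the parentheses inside the quotes would capture just the text.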
If you guys have any comments or questions let me know.


