Parse Google News RSS image in C#


Suppose you want to extract the image URLs from a Google News RSS feed, for example: http://news.google.com/news?hl=us&q=android&output=rss

The following code matches every img src in the source text and populates a list with the value of each src attribute.

// Requires: using System.Collections.Generic; using System.Text.RegularExpressions;
private static IEnumerable<string> GetImagesInGoogleNewsString(string htmlString)
        {
            List<string> imgSrcs = new List<string>();
            // Match <img src="..."> and capture the quoted value of the src attribute.
            var imgSrcMatches = Regex.Matches(htmlString, string.Format(@"<\s*img\s*src\s*=\s*{0}\s*([^{0}]+)\s*{0}", "\""),
               RegexOptions.CultureInvariant | RegexOptions.IgnoreCase | 
               RegexOptions.Multiline);

            // Assumes the src values are protocol-relative (start with "//"), so the scheme is prepended.
            foreach (Match match in imgSrcMatches)
                imgSrcs.Add("http:" + match.Groups[1].Value);

            return imgSrcs;
        }
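
To see what this kind of pattern extracts, here is a minimal self-contained sketch that runs a similar regex over a hard-coded string instead of a live feed. The class name GoogleNewsImageDemo, the method name ExtractImageSources, and the example.com sample markup are all hypothetical, made up for illustration. As a small variation on the method above, it prepends "http:" only when the value starts with "//".

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GoogleNewsImageDemo
{
    // Hypothetical helper: captures the quoted src value of each <img src="..."> tag.
    public static List<string> ExtractImageSources(string html)
    {
        var sources = new List<string>();
        var matches = Regex.Matches(
            html,
            "<\\s*img\\s*src\\s*=\\s*\"\\s*([^\"]+)\\s*\"",
            RegexOptions.CultureInvariant | RegexOptions.IgnoreCase);

        foreach (Match match in matches)
        {
            string src = match.Groups[1].Value;
            // Variation on the original: only protocol-relative URLs get the scheme prepended.
            sources.Add(src.StartsWith("//", StringComparison.Ordinal) ? "http:" + src : src);
        }

        return sources;
    }

    public static void Main()
    {
        // Made-up sample markup; a real feed item will look different.
        string sample =
            "<table><tr><td><img src=\"//example.com/a.jpg\" alt=\"\" width=\"80\"></td>" +
            "<td><IMG SRC=\"//example.com/b.png\"></td></tr></table>";

        foreach (string url in ExtractImageSources(sample))
            Console.WriteLine(url);
    }
}
```

Run against the sample string, this prints http://example.com/a.jpg and http://example.com/b.png. Note that the pattern only matches when src is the first attribute after img; tags that put another attribute before src are skipped, and for anything beyond a quick extraction an HTML parser is the more robust choice.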

