Excel - Extracting an XML attribute using VBA
I'm not a developer and have only limited XML knowledge, which I've picked up over the past 3-4 days researching on the web. Apologies in advance for the basic level of this question. I'm just trying to wrap up a one-time task.
I have some Excel VBA knowledge, and I'm trying to use VBA to extract the SIC code attribute from a given company's page on the SEC filing website. An example is the site for Walmart.
In the blue bar at the top you can see 'SIC: 5331'. It's the 5331 that I'm trying to return to a VBA variable so that I can populate a spreadsheet. When I right-click in IE and click View Source, the relevant part of the page reads in XML as:
<div id="contentdiv">
  <!-- start filer div -->
  <div style="margin: 15px 0 10px 0; padding: 3px; overflow: hidden; background-color: #bcd6f8;">
    <div class="mailer">mailing address
      <span class="maileraddress">702 southwest 8th street</span>
      <span class="maileraddress"> bentonville ar 72716 </span>
    </div>
    <div class="mailer">business address
      <span class="maileraddress">702 southwest 8th st</span>
      <span class="maileraddress">bentonville ar 72716 </span>
      <span class="maileraddress">5012734000</span>
    </div>
    <div class="companyinfo">
      <span class="companyname">wal mart stores inc <acronym title="central index key">cik</acronym>#: <a href="/cgi-bin/browse-edgar?action=getcompany&cik=0000104169&owner=exclude&count=40">0000104169 (see company filings)</a></span>
      <p class="identinfo"><acronym title="standard industrial code">sic</acronym>: <a href="/cgi-bin/browse-edgar?action=getcompany&sic=5331&owner=exclude&count=40">5331</a> - retail-variety stores<br />
        state location: <a href="/cgi-bin/browse-edgar?action=getcompany&state=ar&owner=exclude&count=40">ar</a> | state of inc.: <strong>de</strong> | fiscal year end: 0131<br />
        (assistant director office: 2)<br />
        get <a href="/cgi-bin/own-disp?action=getissuer&cik=0000104169"><b>insider transactions</b></a> <b> issuer</b>. <br />
        get <a href="/cgi-bin/own-disp?action=getowner&cik=0000104169"><b>insider transactions</b></a> <b>reporting owner</b>.
      </p>
    </div>
  </div>
</div>
In trying to understand how VBA might be used to extract the SIC, I found the following post on this site:
Query and parse XML attribute value into XLS using VBA
I tried to apply barrowc's answer by copying and pasting it into an Excel module, and inserted the path to the Wal Mart filings. When I step through, the Debug.Print "*****" lines run, but I'm not getting anything for n.Text.
Sub test4()
    Dim d As MSXML2.DOMDocument60
    Dim nl As IXMLDOMNodeList
    Dim n As IXMLDOMNode

    Set d = New MSXML2.DOMDocument60
    d.async = False
    d.Load ("http://www.sec.gov/cgi-bin/browse-edgar?company=&match=&cik=886475&filenum=&state=&country=&sic=&owner=exclude&find=find+companies&action=getcompany")

    Debug.Print "*****"
    Set nl = d.SelectNodes("//div[@id='contentdiv']")
    For Each n In nl
        Debug.Print n.Text
    Next n
    Debug.Print "*****"

    Set d = Nothing
End Sub
I've tried various strings in d.SelectNodes(), but I don't know enough about this topic to understand where I'm going wrong. Either a comment on the syntax or a pointer to a resource would be enormously helpful.
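If you are only interested in the SIC, it is not worth the time to try to parse the entire DOM structure. Instead, identify a unique set of characters that precedes it, search for that, and extract the SIC from there.
The following function does just that. You need to pass it the full HTML source of the page, and it will return the SIC: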
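Function ExtractSIC(SourceHTML As String) As String
    ' The SIC appears in a link as "...&sic=5331&owner=...", so take the
    ' text between "&sic=" and the next "&".
    Const PrefixChars As String = "&sic="
    Const SuffixChars As String = "&"
    Dim startPos As Long, endPos As Long

    startPos = InStr(SourceHTML, PrefixChars)
    If startPos = 0 Then Exit Function          ' prefix not found: return ""
    startPos = startPos + Len(PrefixChars)      ' first character of the SIC

    endPos = InStr(startPos, SourceHTML, SuffixChars) - 1
    If endPos < startPos Then Exit Function     ' no closing "&" or empty value: return ""

    ExtractSIC = Mid(SourceHTML, startPos, endPos - startPos + 1)
End Function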
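The function needs the page source as a string, so here is a minimal sketch of one way to get it and call ExtractSIC. It assumes ExtractSIC is in the same module. GetPageSource and DemoExtractSIC are hypothetical names introduced only for illustration, and the URL is the one from the question. The request uses a late-bound MSXML2.XMLHTTP.6.0 object. Whether the server accepts such a request without extra headers is not something this sketch can guarantee. If the live page writes the ampersands as &amp; or uses different letter case, PrefixChars would need adjusting to match.

```vb
' Sketch only: GetPageSource and DemoExtractSIC are hypothetical helpers.
' Assumes ExtractSIC (above) is defined in the same module.
Function GetPageSource(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")   ' late binding, no reference required
    http.Open "GET", url, False                      ' False = synchronous request
    http.send
    ' Returns "" if the server does not answer with HTTP 200
    If http.Status = 200 Then GetPageSource = http.responseText
End Function

Sub DemoExtractSIC()
    ' Offline check against a fragment of the source quoted in the question
    Dim sample As String
    sample = "<a href=""/cgi-bin/browse-edgar?action=getcompany&sic=5331&owner=exclude&count=40"">5331</a>"
    Debug.Print ExtractSIC(sample)                   ' prints 5331

    ' Live request (URL taken from the question)
    Dim url As String
    url = "http://www.sec.gov/cgi-bin/browse-edgar?company=&match=&cik=886475&filenum=&state=&country=&sic=&owner=exclude&find=find+companies&action=getcompany"
    Debug.Print ExtractSIC(GetPageSource(url))
End Sub
```

An empty result from the live request means either the download failed or the "&sic=" prefix was not found in the returned source, so printing the first few hundred characters of GetPageSource(url) is a quick way to tell which.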