deepak k jain
2005-03-03 13:12:43 UTC
Dear all,
For a search engine application, I need to extract all the words in an
HTML file. I also need to know whether each word falls inside an
<H1> </H1> tag. This is the code I have so far:
MSHTML::IHTMLDocument2Ptr pDoc;
.
.
< here I load the HTML file into a SAFEARRAY and write it into pDoc >
.
.
IHTMLElementCollectionPtr pCollection, pChildCollection;
IHTMLElementPtr pElement, pChildElem;
IDispatch* pDispatch = NULL;
pDoc->get_all(&pCollection);
for (long i = 0; i < pCollection->length; i++)
{
    pElement = pCollection->item(i, (long)0);
    pElement->get_children(&pDispatch);
    pDispatch->QueryInterface(IID_IHTMLElementCollection, (LPVOID*)pChildCollection);
    for (long j = 0; j < pChildCollection->length; j++)
    {
        pChildElem = pChildCollection->item(j, (long)0);
        < USE THIS pChildElem now to get each word >
    }
}
Problem: pChildCollection is NULL after the QueryInterface call.
I don't know whether my approach itself is right. On top of that,
pChildCollection comes back NULL, so I don't know what to do next.
Please give me some pointers.
Thanks,
deepak jain
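
To make the expected result concrete (every word, plus a flag saying whether it sits inside <H1>), here is a minimal sketch in portable C++ that does not use MSHTML at all. It is only a hand-rolled scanner for checking what the output should look like, not a replacement for a real HTML parser: WordHit and extractWords are hypothetical names invented for this example, and the code assumes well-formed tags, no '>' inside attribute values, and no handling of entities, comments, <script> or <style>.

```cpp
#include <cctype>
#include <string>
#include <vector>

// One extracted word plus a flag saying whether it appeared inside <h1>...</h1>.
struct WordHit {
    std::string word;
    bool inH1;
};

// Hypothetical helper (not part of MSHTML): scans raw HTML text, skips tags,
// and splits the remaining text into alphanumeric words.
std::vector<WordHit> extractWords(const std::string& html) {
    std::vector<WordHit> hits;
    std::string current;  // word currently being accumulated
    int h1Depth = 0;      // greater than 0 while inside an open <h1>

    // Store the pending word, tagged with the current <h1> state.
    auto flush = [&]() {
        if (!current.empty()) {
            hits.push_back({current, h1Depth > 0});
            current.clear();
        }
    };

    for (std::size_t i = 0; i < html.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(html[i]);
        if (c == '<') {
            flush();  // a tag always ends the current word
            std::size_t end = html.find('>', i);
            if (end == std::string::npos) break;  // unterminated tag: stop scanning
            std::size_t p = i + 1;
            bool closing = (p < end && html[p] == '/');
            if (closing) ++p;
            // Read the tag name, lower-cased so <H1> and <h1> compare equal.
            std::string name;
            while (p < end && std::isalnum(static_cast<unsigned char>(html[p]))) {
                name += static_cast<char>(std::tolower(static_cast<unsigned char>(html[p])));
                ++p;
            }
            if (name == "h1") {
                if (closing) {
                    if (h1Depth > 0) --h1Depth;
                } else {
                    ++h1Depth;
                }
            }
            i = end;  // resume after '>'
        } else if (std::isalnum(c)) {
            current += static_cast<char>(c);
        } else {
            flush();  // whitespace or punctuation ends the word
        }
    }
    flush();
    return hits;
}
```

Calling extractWords on a string such as "<H1>Hello World</H1><p>plain text</p>" should give four entries, with the first two flagged as inside <H1>. The same word/flag pairs are what the MSHTML-based loop would ultimately need to produce.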