Extracting PHP/HTML code

Discussion in 'Delphi Help&Requests' started by gundulapek, Oct 13, 2014.

  1. gundulapek
    gundulapek Guest

    Edited: to show the images

    I have a problem extracting prices from the website www.starcitygames.com, as in the image below.
    [IMG]

    Using this procedure, the result I get looks like the second image:
    Code:
    // Requires MSHTML in the uses clause for the IHTML* interfaces.
    procedure TForm1.WebBrowser1DocumentComplete(ASender: TObject;
      const pDisp: IDispatch; const URL: OleVariant);
    var
      Document: IHTMLDocument2;
      coll: IHTMLElementCollection;
      x: Integer;
      dispatch: IDispatch;
      e: IHTMLElement;
      tagName: string;
      anchor: IHTMLAnchorElement;
      image: IHTMLImgElement;
    begin
      Memo1.Clear;
      // Document is nil until a page has been loaded.
      if WebBrowser1.Document = nil then
        Exit;
      if not Succeeded(WebBrowser1.Document.QueryInterface(IID_IHTMLDocument2, Document)) then
        Exit;

      coll := Document.all;

      // Walk every element and log the anchors, images and divs.
      for x := 0 to coll.length - 1 do
      begin
        dispatch := coll.item(x, x);
        if dispatch = nil then
          Continue;
        if not Succeeded(dispatch.QueryInterface(IID_IHTMLElement, e)) then
          Continue;
        dispatch := nil;

        tagName := LowerCase(e.tagName);
        if tagName = 'a' then
        begin
          if not Succeeded(e.QueryInterface(IID_IHTMLAnchorElement, anchor)) then
            Continue;
          Memo1.Lines.Add('A:' + e.innerText + ' ' + anchor.href);
        end
        else if tagName = 'img' then
        begin
          if not Succeeded(e.QueryInterface(IID_IHTMLImgElement, image)) then
            Continue;
          Memo1.Lines.Add('IMG: ' + image.src);
        end
        else if tagName = 'div' then
        begin
          // A div has no src and does not implement IHTMLImgElement, so log
          // its class list and text through IHTMLElement instead.
          Memo1.Lines.Add('DIV: ' + e.className + ' ' + e.innerText);
        end;
      end;
    end;
    [IMG]

    I know the page source uses div elements, but I don't know how to get the value. I tried saving the web page, but the saved copy does not show the stock or the price, and the page source shows no sign of the price either.
    [IMG]
    Is there any way I could grab the price?

    Thanks
     

  2. SPCoeur
    SPCoeur Guest

    You actually need an HTML parser. I'm also looking for one in Delphi.
     
  3. N0body
    N0body (DelphiFan Administrator, Staff Member)

    I think you can use JvHTMLParser from the JEDI components.
     
  4. N0body
    N0body (DelphiFan Administrator, Staff Member)

    You can also use this component:
    http://www.delphifan.com/forum/Thread-DIHtmlParser-v6-5-0-XE2-XE3-32-64?highlight=DIHtmlParser

    but it needs to be ported to XE6 or XE7.
     
  5. gundulapek
    gundulapek Guest

    I use Delphi XE2. The problem is that when I look at the page source, there is no sign of the price:
    <td class="deckdbbody search_results_7"><a href="http://www.starcitygames.com/pages/cardconditions">NM/M</a></td>
    <td class="deckdbbody search_results_8"><div class="CrCqMQ nwXfjp2 Lstghu">&nbsp;</div><div class="pItbQj2 CrCqMQ Lstghu">&nbsp;</div> << it is the stock i think </td>
    <td class="deckdbbody search_results_9"><div style='width:45px'><div class="Lstghu">$</div> << the '$' but no number for the price
    <div class="Lstghu CrCqMQ AwkgNk2">&nbsp;</div><div class="xSmJwc2 Lstghu CrCqMQ">&nbsp;</div><div class="Lstghu CrCqMQ cBkAee2">&nbsp;</div><div class="xvmSIh2 Lstghu CrCqMQ">&nbsp;</div></div></td>


    But where are the digits of the price? >.<
     
  6. N0body
    N0body (DelphiFan Administrator, Staff Member)

    I think the site encodes the price, something like this:

    <div class="YSssla">$</div>   the $ is plain text, so that is no problem, but
    <div class="GDrmVd YSssla chCmUM">&nbsp;</div>   this is 2
    <div class="YSssla chCmUM kLtjTJ">&nbsp;</div>   this is .
    <div class="YSssla chCmUM AFOLoW">&nbsp;</div>   this is 9
    <div class="chCmUM AFOLoW YSssla">&nbsp;</div>   this is 9

    The price is $2.99
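
    Reading the price one div at a time, as described above, starts with collecting each empty div's class list in document order. The following is only a rough sketch of that step using TRegEx on a markup fragment copied from the earlier post; the function name GlyphClassLists and the exact pattern are assumptions for illustration, not something the site or a parser component provides.

```delphi
program GlyphClassesSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.RegularExpressions;

// Returns the class attribute of every <div class="...">&nbsp;</div> found in
// Html, in document order. The caller owns (and must free) the returned list.
// GlyphClassLists is a hypothetical helper name used only in this sketch.
function GlyphClassLists(const Html: string): TStringList;
var
  M: TMatch;
begin
  Result := TStringList.Create;
  for M in TRegEx.Matches(Html, '<div class="([^"]*)">&nbsp;</div>') do
    Result.Add(M.Groups[1].Value);
end;

var
  Lists: TStringList;
  S: string;
begin
  // Markup fragment taken from the price cell quoted earlier in the thread.
  Lists := GlyphClassLists(
    '<div class="Lstghu">$</div>' +
    '<div class="Lstghu CrCqMQ AwkgNk2">&nbsp;</div>' +
    '<div class="xSmJwc2 Lstghu CrCqMQ">&nbsp;</div>' +
    '<div class="Lstghu CrCqMQ cBkAee2">&nbsp;</div>' +
    '<div class="xvmSIh2 Lstghu CrCqMQ">&nbsp;</div>');
  try
    // One line per glyph div; the "$" div is skipped because it holds text.
    for S in Lists do
      Writeln(S);
  finally
    Lists.Free;
  end;
end.
```

    A regular expression is enough for a fragment this regular; for whole pages a real HTML parser, or the IHTMLElement.className values from the browser control, would be more robust.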
     
  7. gundulapek
    gundulapek Guest

    But when I look at another card with the same price and compare the two, the class names are different.

    The only answer I can think of is that the value is passed as a parameter through <div class= XXXXX>
     
  8. N0body
    N0body (DelphiFan Administrator, Staff Member)

    I don't think it is the class.
    The class is just HTML markup for the website.
    The site uses a different encoding for the prices; I don't know how to decode it.
     
  9. gundulapek
    gundulapek Guest

    Thanks, I will try something.
     
  10. gundulapek
    gundulapek Guest

    I dug deeper into the page source and found something interesting:

    Code:
    <style>
    /* Agis artes sunt infirma */
    .iZmsgF {background-image:url(//sales.starcitygames.com/price_icons.php?id=8Hb_jUcAf95D0Hda_tKLJfDKJrOZHighM2J3MfUrqZc);	}	
    .rdhVlK {width:7px; float:left;	height:14px;}
    .HqbaaF {background-position:0px -2px;}
    .HqbaaF2 {background-position:0px 21px;}
    .RWpWLm {background-position:-14px -2px;}
    .RWpWLm2 {background-position:-14px 21px;}
    .duxkLw {background-position:-63px -2px;width:3px; }
    .duxkLw2 {background-position:-63px 21px;width:3px; }
    .Xtdtoz {background-position:-66px -2px;}
    .Xtdtoz2 {background-position:-66px 21px;}
    .gzwEgb {background-position:-35px -2px;}
    .gzwEgb2 {background-position:-35px 21px;}
    .vriNrU {background-position:-49px -2px;}
    .vriNrU2 {background-position:-49px 21px;}
    .hndFSf {background-position:-7px -2px;}
    .hndFSf2 {background-position:-7px 21px;}
    .iVndks {background-position:-56px -2px;}
    .iVndks2 {background-position:-56px 21px;}
    .hjuwjk {background-position:-21px -2px;}
    .hjuwjk2 {background-position:-21px 21px;}
    .uyluJv {background-position:-42px -2px;}
    .uyluJv2 {background-position:-42px 21px;}
    .wsenFo {background-position:-28px -2px;}
    .wsenFo2 {background-position:-28px 21px;}
    </style>
    
    I tested it, and this encryption is a clever idea: the page loads one image and shifts its background position until the right digit shows. An example is shown below.

    [attachment=57]

    The image changes according to:
    Code:
    .iZmsgF {background-image:url(//sales.starcitygames.com/price_icons.php?id=8Hb_jUcAf95D0Hda_tKLJfDKJrOZHighM2J3MfUrqZc);	}
    
    What I still need to know is about DIU: are there any references for it, or is it just a simple component?
    Or maybe there is another way to translate the image into numbers.

    thank you
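
    The stylesheet above ties each class name to a horizontal offset into the sprite image, so one possible way to decode a price without image recognition is to parse those offsets and look each one up in a table of glyphs. This is only a sketch under stated assumptions: the thread does not say which digit sits at which offset, so the GlyphAt table below is a made-up placeholder that would have to be filled in by looking at the image for the current page; ParseOffsets and DecodePrice are hypothetical names; and the class combinations in the demo are assembled from the stylesheet's names purely for illustration. The classes ending in 2 share the horizontal offset of their counterparts and differ only vertically, and the 3px-wide rule at -63px may be the decimal point, but that is a guess.

```delphi
program SpriteDecodeSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.RegularExpressions,
  System.Generics.Collections;

// Collects ".name {background-position:Xpx ..." rules into a map of
// class name -> horizontal offset in pixels. Rules without a leading
// background-position (such as the width/float rule) are ignored.
function ParseOffsets(const Css: string): TDictionary<string, Integer>;
var
  M: TMatch;
begin
  Result := TDictionary<string, Integer>.Create;
  for M in TRegEx.Matches(Css,
    '\.([A-Za-z0-9_]+)\s*\{\s*background-position\s*:\s*(-?\d+)px') do
    Result.AddOrSetValue(M.Groups[1].Value, StrToInt(M.Groups[2].Value));
end;

// Decodes one price. ClassLists holds the class attribute of each glyph div
// in document order; GlyphAt maps a horizontal offset to a character.
// Offsets missing from GlyphAt are rendered as '?'.
function DecodePrice(Offsets: TDictionary<string, Integer>;
  GlyphAt: TDictionary<Integer, Char>;
  const ClassLists: array of string): string;
var
  ClassList, Name: string;
  Names: TStringList;
  X: Integer;
  Ch: Char;
begin
  Result := '';
  Names := TStringList.Create;
  try
    Names.Delimiter := ' ';
    Names.StrictDelimiter := True;
    for ClassList in ClassLists do
    begin
      Names.DelimitedText := ClassList;
      // The first class that carries a background-position picks the glyph.
      for Name in Names do
        if Offsets.TryGetValue(Name, X) then
        begin
          if GlyphAt.TryGetValue(X, Ch) then
            Result := Result + Ch
          else
            Result := Result + '?';
          Break;
        end;
    end;
  finally
    Names.Free;
  end;
end;

var
  Offsets: TDictionary<string, Integer>;
  GlyphAt: TDictionary<Integer, Char>;
begin
  // A few rules copied from the stylesheet quoted above.
  Offsets := ParseOffsets(
    '.rdhVlK {width:7px; float:left; height:14px;}' +
    '.HqbaaF {background-position:0px -2px;}' +
    '.RWpWLm2 {background-position:-14px 21px;}' +
    '.duxkLw {background-position:-63px -2px;width:3px; }');
  GlyphAt := TDictionary<Integer, Char>.Create;
  try
    // HYPOTHETICAL glyph table: the real order must be read off the image.
    GlyphAt.Add(0, '2');
    GlyphAt.Add(-14, '9');
    GlyphAt.Add(-63, '.');
    // HYPOTHETICAL class lists, one per glyph div.
    Writeln(DecodePrice(Offsets, GlyphAt,
      ['iZmsgF rdhVlK HqbaaF', 'iZmsgF duxkLw rdhVlK',
       'RWpWLm2 iZmsgF rdhVlK', 'rdhVlK RWpWLm2 iZmsgF']));
  finally
    GlyphAt.Free;
    Offsets.Free;
  end;
end.
```

    Because the class names, and possibly the image itself, can differ from one page to the next, both the stylesheet and the glyph table would need to be rebuilt for every page that is read.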
     

  11. Dergen
    Dergen DF Member

    As you say, the CSS links to an image, and each digit of the price is a region of that image selected by its background position.
    They went through some fancy programming to keep people from scraping the price,
    and you will need some fancy programming to extract it.

    They could change the encoding on every page load if they wanted to.

    What I want to know is: is there much value in knowing and extracting these prices?
     
  12. gundulapek
    gundulapek Guest

    StarCityGames is our price reference in Indonesia when we sell Magic: The Gathering singles, so my goal is to make pricing easier for myself (I own an official MTG card shop). I need to track price changes so that it is easier to update card prices in local currency.

    Thanks :)
     
