3 \"@sXddlZddlZddlZdgZejddZGdddZGdddZGdd d Z dS) NRobotFileParser RequestRatezrequests secondsc@sfeZdZdddZddZddZdd Zd d Zd d ZddZ ddZ ddZ ddZ ddZ dS)rcCs,g|_d|_d|_d|_|j|d|_dS)NFr)entries default_entry disallow_all allow_allset_url last_checked)selfurlr */usr/lib64/python3.6/urllib/robotparser.py__init__s  zRobotFileParser.__init__cCs|jS)N)r )r r r rmtime$szRobotFileParser.mtimecCsddl}|j|_dS)Nr)timer )r rr r rmodified-szRobotFileParser.modifiedcCs&||_tjj|dd\|_|_dS)N)r urllibparseurlparsehostpath)r r r r rr 5szRobotFileParser.set_urlcCsytjj|j}WnRtjjk rd}z2|jdkr:d|_n|jdkrT|jdkrTd|_WYdd}~XnX|j }|j |j dj dS)NTiizutf-8)rr) rZrequestZurlopenr errorZ HTTPErrorcoderrreadrdecode splitlines)r ferrrawr r rr:s zRobotFileParser.readcCs,d|jkr|jdkr(||_n |jj|dS)N*) useragentsrrappend)r entryr r r _add_entryGs  zRobotFileParser._add_entrycCs6d}t}|jx|D]}|sT|dkr8t}d}n|dkrT|j|t}d}|jd}|dkrr|d|}|j}|sq|jdd}t|dkr|djj|d<tj j |dj|d<|ddkr|dkr|j|t}|j j |dd}q|ddkr4|dkr|j j t|ddd}q|dd krh|dkr|j j t|dd d}q|dd kr|dkr|djjrt|d|_d}q|dd kr|dkr|djd }t|dkr|djjr|djjrtt|dt|d|_d}qW|dkr2|j|dS)Nrr#:z user-agentZdisallowFZallowTz crawl-delayz request-rate/)Entryrr(findstripsplitlenlowerrrunquoter%r& rulelinesRuleLineisdigitintdelayrreq_rate)r linesstater'lineiZnumbersr r rrPsd             zRobotFileParser.parsecCs|jr dS|jrdS|jsdStjjtjj|}tjjdd|j|j |j |j f}tjj |}|sfd}x"|j D]}|j|rn|j|SqnW|jr|jj|SdS)NFTrr,)rrr rrrr3 urlunparserZparamsZqueryZfragmentquoter applies_to allowancer)r useragentr Z parsed_urlr'r r r can_fetchs$    zRobotFileParser.can_fetchcCs4|js dSx|jD]}|j|r|jSqW|jjS)N)rrr@r8r)r rBr'r r r crawl_delays    zRobotFileParser.crawl_delaycCs4|js dSx|jD]}|j|r|jSqW|jjS)N)rrr@r9r)r rBr'r r r request_rates    zRobotFileParser.request_ratecCs0|j}|jdk r||jg}djtt|dS)N )rrjoinmapstr)r rr r r__str__s  zRobotFileParser.__str__N)r)__name__ __module__ __qualname__rrrr rr(rrCrDrErJr r r rrs    Cc@s$eZdZddZddZddZdS)r5cCs>|dkr| rd}tjjtjj|}tjj||_||_dS)NrT)rrr>rr?rrA)r rrAr r rrs zRuleLine.__init__cCs|jdkp|j|jS)Nr$)r startswith)r filenamer r rr@szRuleLine.applies_tocCs|jr dndd|jS)NZAllowZDisallowz: )rAr)r r r rrJszRuleLine.__str__N)rKrLrMrr@rJr r r rr5sr5c@s,eZdZddZddZddZddZd S) r-cCsg|_g|_d|_d|_dS)N)r%r4r8r9)r r r rrszEntry.__init__cCsg}x|jD]}|jd|q W|jdk r@|jd|j|jdk rj|j}|jd|jd|j|jtt|j |jddj |S)Nz User-agent: z Crawl-delay: zRequest-rate: r,rrF) r%r&r8r9ZrequestsZsecondsextendrHrIr4rG)r ZretagentZrater r rrJs    z Entry.__str__cCsF|jddj}x.|jD]$}|dkr*dS|j}||krdSqWdS)Nr,rr$TF)r0r2r%)r rBrQr r rr@s zEntry.applies_tocCs$x|jD]}|j|r|jSqWdS)NT)r4r@rA)r rOr<r r rrAs   zEntry.allowanceN)rKrLrMrrJr@rAr r r rr-s  r-) collectionsZ urllib.parserZurllib.request__all__ namedtuplerrr5r-r r r r s 2