
Selenium Source Read-Through · 10 | webdriver/common/proxy.py: Analysis of the Proxy Class

Author: 虫无涯
  • 2023-04-25
    Shaanxi

1 Source path

selenium/webdriver/common/proxy.py


2 Purpose

  • Route browser traffic through a proxy to get around some anti-scraping measures.

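3 Motivation

  • When Selenium WebDriver is used for testing or scraping and the client is not on the target service's allowlist, access to the service may be blocked as the request frequency increases;

  • When the target service has fairly mature anti-scraping measures, Selenium-driven scraping is easily blocked;

  • Distributed Selenium scraping is also easily blocked.


In summary, the Proxy class provides the proxy configuration that helps work around these anti-scraping restrictions.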

4 Proxy types

  • Source:


class ProxyType:
    """
    Set of possible types of proxy.

    Each proxy type has 2 properties:
       'ff_value' is value of Firefox profile preference,
       'string' is id of proxy type.
    """

    DIRECT = ProxyTypeFactory.make(0, 'DIRECT')  # Direct connection, no proxy (default on Windows).
    MANUAL = ProxyTypeFactory.make(1, 'MANUAL')  # Manual proxy settings (e.g., for httpProxy).
    PAC = ProxyTypeFactory.make(2, 'PAC')  # Proxy autoconfiguration from URL.
    RESERVED_1 = ProxyTypeFactory.make(3, 'RESERVED1')  # Never used.
    AUTODETECT = ProxyTypeFactory.make(4, 'AUTODETECT')  # Proxy autodetection (presumably with WPAD).
    SYSTEM = ProxyTypeFactory.make(5, 'SYSTEM')  # Use system settings (default on Linux).
    UNSPECIFIED = ProxyTypeFactory.make(6, 'UNSPECIFIED')  # Not initialized (for internal use).


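  • Notes: each proxy type is a pair of values. ff_value is the number written to the Firefox profile preference, and string is the identifier of the type. DIRECT means no proxy, MANUAL uses hand-entered settings such as httpProxy, PAC loads an auto-configuration script from a URL, AUTODETECT discovers the proxy automatically, SYSTEM follows the operating system settings, and UNSPECIFIED marks a proxy that has not been initialized.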
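To make the type table concrete, here is a minimal sketch of resolving a type name to its pair of values. The table PROXY_TYPES and the helper find_proxy_type are hypothetical names used only for illustration; they mirror the values listed in the source above and are not part of Selenium.

```python
# Stand-in table mirroring the (string -> ff_value) pairs listed in ProxyType.
PROXY_TYPES = {
    "DIRECT": 0,
    "MANUAL": 1,
    "PAC": 2,
    "RESERVED1": 3,
    "AUTODETECT": 4,
    "SYSTEM": 5,
    "UNSPECIFIED": 6,
}


def find_proxy_type(name):
    """Hypothetical helper: return the {'ff_value', 'string'} pair for a type name."""
    key = str(name).upper()  # accept lower-case input such as "manual"
    if key not in PROXY_TYPES:
        raise ValueError(f"No proxy type is found for {name!r}")
    return {"ff_value": PROXY_TYPES[key], "string": key}


print(find_proxy_type("manual"))  # {'ff_value': 1, 'string': 'MANUAL'}
```

The lookup is case-insensitive here only as a convenience; that choice is an assumption of the sketch.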



5 Proxy type factory and properties

class ProxyTypeFactory:
    """
    Factory for proxy types.
    """

    @staticmethod
    def make(ff_value, string):
        return {'ff_value': ff_value, 'string': string}
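Because make returns a plain dict, each proxy type constant can be indexed and compared like any other dict. The short sketch below copies the factory from the source so that it runs without Selenium installed; the module-level names MANUAL and PAC are defined here only for the demonstration.

```python
class ProxyTypeFactory:
    """Copy of the factory shown above, repeated so the snippet is self-contained."""

    @staticmethod
    def make(ff_value, string):
        return {'ff_value': ff_value, 'string': string}


MANUAL = ProxyTypeFactory.make(1, 'MANUAL')
PAC = ProxyTypeFactory.make(2, 'PAC')

print(MANUAL['ff_value'])  # 1
print(MANUAL['string'])    # MANUAL
# Dicts compare by value, so two MANUAL pairs are equal and MANUAL differs from PAC.
print(MANUAL == ProxyTypeFactory.make(1, 'MANUAL'))  # True
print(MANUAL == PAC)                                 # False
```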



6 Implementation of the individual proxy settings

  • Source:


class Proxy(object):
    """
    Proxy contains information about proxy type and necessary proxy settings.
    """

    proxyType = ProxyType.UNSPECIFIED
    autodetect = False
    ftpProxy = ''
    httpProxy = ''
    noProxy = ''
    proxyAutoconfigUrl = ''
    sslProxy = ''
    socksProxy = ''
    socksUsername = ''
    socksPassword = ''


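  • Notes: each proxy setting is implemented as a pair of methods: a property getter that returns the current setting, and a setter that checks the proxy type and then stores the value. For example: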


    @property
    def http_proxy(self):
        """
        Returns http proxy setting.
        """
        return self.httpProxy

    @http_proxy.setter
    def http_proxy(self, value):
        """
        Sets http proxy setting.

        :Args:
         - value: The http proxy value.
        """
        self._verify_proxy_type_compatibility(ProxyType.MANUAL)
        self.proxyType = ProxyType.MANUAL
        self.httpProxy = value
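The getter/setter pair can be tried out without Selenium using a simplified stand-in. In the sketch below, MiniProxy and the three module-level constants are hypothetical. The body of _verify_proxy_type_compatibility is an assumption about what such a check might do (reject a setting whose type conflicts with one already chosen), since the source excerpt does not show it.

```python
# Stand-in constants shaped like the values ProxyTypeFactory.make returns.
UNSPECIFIED = {"ff_value": 6, "string": "UNSPECIFIED"}
MANUAL = {"ff_value": 1, "string": "MANUAL"}
PAC = {"ff_value": 2, "string": "PAC"}


class MiniProxy:
    """Simplified stand-in for Proxy; not Selenium's actual class."""

    def __init__(self):
        self.proxyType = UNSPECIFIED
        self.httpProxy = ""
        self.proxyAutoconfigUrl = ""

    def _verify_proxy_type_compatibility(self, compatible_type):
        # Assumed rule: once a type is chosen, only settings of that type are accepted.
        if self.proxyType not in (UNSPECIFIED, compatible_type):
            raise ValueError(
                "proxy type %s is not compatible with current type %s"
                % (compatible_type["string"], self.proxyType["string"])
            )

    @property
    def http_proxy(self):
        return self.httpProxy

    @http_proxy.setter
    def http_proxy(self, value):
        self._verify_proxy_type_compatibility(MANUAL)
        self.proxyType = MANUAL
        self.httpProxy = value

    @property
    def proxy_autoconfig_url(self):
        return self.proxyAutoconfigUrl

    @proxy_autoconfig_url.setter
    def proxy_autoconfig_url(self, value):
        self._verify_proxy_type_compatibility(PAC)
        self.proxyType = PAC
        self.proxyAutoconfigUrl = value


proxy = MiniProxy()
proxy.http_proxy = "127.0.0.1:8080"  # hypothetical proxy address
print(proxy.proxyType["string"])     # MANUAL

try:
    # A PAC setting conflicts with the MANUAL type chosen above.
    proxy.proxy_autoconfig_url = "http://example.com/proxy.pac"  # hypothetical URL
except ValueError as exc:
    print("rejected:", exc)
```

Setting the value through the property, rather than assigning httpProxy directly, is what keeps proxyType in step with the setting that was supplied.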

7 Example

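from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

port = 8080  # hypothetical port; replace with your proxy's port

profile = FirefoxProfile()
# Use manual proxy configuration (1 is the ff_value of ProxyType.MANUAL)
profile.set_preference("network.proxy.type", 1)
# Proxy IP ("ip" is a placeholder)
profile.set_preference("network.proxy.http", "ip")
# Proxy port
profile.set_preference("network.proxy.http_port", port)
# Share this IP and port across all protocols
profile.set_preference("network.proxy.share_proxy_settings", True)
# Start the browser with this profile (older Selenium call style, matching the
# source version analyzed here; newer releases may expect the profile via Options)
driver = webdriver.Firefox(profile)
# Open the page ('xxxxx' is a placeholder URL)
driver.get('xxxxx')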
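The four preferences set in the example can also be collected in one place, which makes them easier to inspect before a browser is started. This is a minimal sketch: manual_proxy_preferences is a hypothetical helper, and the address 127.0.0.1:8080 is a made-up value. Only the preference names come from the example above.

```python
def manual_proxy_preferences(host, port, share=True):
    """Hypothetical helper: collect the Firefox preferences for a manual HTTP proxy."""
    return {
        "network.proxy.type": 1,  # 1 is the ff_value of ProxyType.MANUAL
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),  # the port is stored as an integer
        "network.proxy.share_proxy_settings": share,
    }


prefs = manual_proxy_preferences("127.0.0.1", "8080")  # hypothetical proxy address
for name, value in prefs.items():
    # With a real FirefoxProfile this would be: profile.set_preference(name, value)
    print(name, "=", value)
```

Keeping the preferences in a dict means the same values can be applied to a profile in a loop or logged for debugging.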

