XXE Attack via lxml in Pandas

Description

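When pandas parses markup with the lxml library, which `pandas.read_html` tries first when no `flavor` is given, the parse may be exposed to XML external entity (XXE) attacks. An attacker who controls the parsed document may be able to exploit this to access sensitive data, such as files readable by the process, or to execute system calls.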
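
To make the attack shape concrete, the sketch below shows what a typical XXE payload looks like and a crude pre-parse check for entity declarations. The names `XXE_SAMPLE` and `declares_entity` are hypothetical and exist only for illustration. The pattern match is not a complete defense (it would miss, for instance, an entity pulled in through an external DTD) and is no substitute for choosing a safe parser.

```python
import re

# Illustrative payload: the DOCTYPE declares an external entity `xxe`
# pointing at a local file, and the body references it as `&xxe;`.
XXE_SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE table [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<table><tr><td>&xxe;</td></tr></table>
"""

# Hypothetical helper: matches an inline ENTITY declaration.
# A plain `<!DOCTYPE html>` is deliberately not flagged.
_ENTITY_DECL = re.compile(r"<!ENTITY\b", re.IGNORECASE)


def declares_entity(markup: str) -> bool:
    """Return True if the markup contains an inline ENTITY declaration."""
    return _ENTITY_DECL.search(markup) is not None
```

A check like this could be used to reject or log suspicious input before it reaches any parser, as a secondary safeguard alongside the remediation described in this rule.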

Examples

Insecure Code

```python
import pandas
pandas.read_html('https://example.com/vulnerable.xml')  # no flavor given: lxml is tried first
```

Secure Code

```python
import pandas
pandas.read_html('https://example.com/vulnerable.xml', flavor='html5lib')  # parsed without lxml
```

Remediation

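Use the `html5lib` or `bs4` parser instead of `lxml` to prevent XXE attacks by passing the flavor explicitly: `pandas.read_html(..., flavor='html5lib')` or `pandas.read_html(..., flavor='bs4')`. pandas treats these two flavors as synonymous, and both require the `beautifulsoup4` and `html5lib` packages to be installed.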
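
One way to apply this consistently is to route calls through a small wrapper that refuses any flavor involving `lxml`. The following is a minimal sketch, not part of pandas: `ALLOWED_FLAVORS`, `safe_flavor`, and `read_html_safely` are hypothetical names, and the wrapper assumes `pandas`, `beautifulsoup4`, and `html5lib` are installed.

```python
# Flavors that do not route parsing through lxml.
ALLOWED_FLAVORS = {"html5lib", "bs4"}


def safe_flavor(flavor=None):
    """Return a flavor that avoids lxml, defaulting to html5lib."""
    if flavor is None:
        return "html5lib"
    # Lists such as ["lxml", "bs4"] are rejected too, since they may select lxml.
    if not isinstance(flavor, str) or flavor not in ALLOWED_FLAVORS:
        raise ValueError(
            f"flavor {flavor!r} is not allowed; use one of {sorted(ALLOWED_FLAVORS)}"
        )
    return flavor


def read_html_safely(source, flavor=None, **kwargs):
    """Hypothetical wrapper around pandas.read_html with a vetted flavor."""
    import pandas as pd  # imported lazily so safe_flavor works without pandas

    return pd.read_html(source, flavor=safe_flavor(flavor), **kwargs)
```

With this in place, `read_html_safely('https://example.com/vulnerable.xml')` would parse with `html5lib`, while `read_html_safely(..., flavor='lxml')` raises `ValueError` before any parsing happens.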

Rule Details

| Field | Value |
| --- | --- |
| ID | CODE-0757 |
| Category | Injection |
| Severity | HIGH |
| CWE | CWE-611 |
| Confidence | HIGH |
| Impact | MEDIUM |
| Likelihood | MEDIUM |
| Exploitability | MODERATE |
| Tags | XXE, XML External Entity |
| OWASP | N/A |

References