XXE Attack via lxml in Pandas
Description
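Parsing untrusted documents with the lxml library in pandas is vulnerable to XML external entity (XXE) attacks. `pandas.read_html` tries lxml first when no `flavor` is given, so calls that rely on the default are affected. An attacker who controls the parsed document can exploit this to access sensitive data, such as local files, or execute system calls.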
Examples
Insecure Code
```python
import pandas
pandas.read_html('https://example.com/vulnerable.xml')  # default flavor tries lxml first
```
Secure Code
```python
import pandas
pandas.read_html('https://example.com/vulnerable.xml', flavor='html5lib')  # never uses lxml
```
Remediation
Use the `html5lib` or `bs4` parser instead of `lxml` to prevent XXE attacks. For example, use `pandas.read_html(..., flavor='html5lib')` or `pandas.read_html(..., flavor='bs4')`.
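One way to apply this consistently is to route every call through a small wrapper that refuses any flavor other than the two named above. The sketch below is only an illustration: `SAFE_FLAVORS`, `require_safe_flavor`, and `read_html_safely` are hypothetical names that are not part of pandas, and the wrapper assumes a single string flavor even though `pandas.read_html` also accepts a list.

```python
from typing import Optional

# Flavors treated as safe here; 'lxml' and the default (None, which tries
# lxml first) are deliberately excluded. Hypothetical constant, not a pandas API.
SAFE_FLAVORS = frozenset({"html5lib", "bs4"})


def require_safe_flavor(flavor: Optional[str]) -> str:
    """Return the flavor unchanged if it is safe, otherwise raise ValueError."""
    if not isinstance(flavor, str) or flavor not in SAFE_FLAVORS:
        raise ValueError(
            f"flavor={flavor!r} may use lxml; choose one of {sorted(SAFE_FLAVORS)}"
        )
    return flavor


def read_html_safely(io, flavor: str = "html5lib", **kwargs):
    """Hypothetical wrapper: call pandas.read_html only with a non-lxml flavor."""
    checked = require_safe_flavor(flavor)  # validate before touching pandas

    # Imported lazily so the flavor check can be reused without pandas installed.
    import pandas

    return pandas.read_html(io, flavor=checked, **kwargs)
```

A call such as `read_html_safely('https://example.com/vulnerable.xml')` then defaults to `html5lib`, while `read_html_safely(..., flavor='lxml')` raises `ValueError` before any parsing happens. Both safe flavors need the `beautifulsoup4` and `html5lib` packages to be installed.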
Rule Details
| Field | Value |
|---|---|
| ID | CODE-0757 |
| Category | Injection |
| Severity | HIGH |
| CWE | CWE-611 |
| Confidence | HIGH |
| Impact | MEDIUM |
| Likelihood | MEDIUM |
| Exploitability | MODERATE |
| Tags | XXE, XML External Entity |
| OWASP | N/A |