File: //opt/alt/python37/lib/python3.7/site-packages/charset_normalizer/__pycache__/api.cpython-37.pyc
B

�bOO�@s�ddlZddlmZmZddlmZmZmZmZyddl	m
Z
Wnek
rXeZ
YnXddl
mZmZmZmZddlmZmZmZmZddlmZdd	lmZmZdd
lmZmZmZm Z m!Z!m"Z"e�#d�Z$e�%�Z&e&�'e�(d��de)e*e*e+eeeee,e,ed�	dd�Z-dee*e*e+eeeee,e,ed�	dd�Z.d e
e*e*e+eeeee,e,ed�	dd�Z/d!e
e*e*e+eeeee,ed�dd�Z0dS)"�N)�basename�splitext)�BinaryIO�List�Optional�Set)�PathLike�)�coherence_ratio�encoding_languages�mb_encoding_languages�merge_coherence_ratios)�IANA_SUPPORTED�TOO_BIG_SEQUENCE�TOO_SMALL_SEQUENCE�TRACE)�
mess_ratio)�CharsetMatch�CharsetMatches)�any_specified_encoding�	iana_name�identify_sig_or_bom�
is_cp_similar�is_multi_byte_encoding�should_strip_sig_or_bom�charset_normalizerz)%(asctime)s | %(levelname)s | %(message)s��皙�����?TF)	�	sequences�steps�
chunk_size�	threshold�cp_isolation�cp_exclusion�preemptive_behaviour�explain�returnc1Cs�	t|ttf�s td�t|����|r>tj}t�t	�t�
t�t|�}	|	dkr�t�
d�|rvt�t	�t�
|prtj�tt|dddgd�g�S|dk	r�t�td	d
�|��dd�|D�}ng}|dk	r�t�td
d
�|��dd�|D�}ng}|	||k�rt�td|||	�d}|	}|dk�r:|	||k�r:t|	|�}t|�tk}
t|�tk}|
�rlt�td�|	��n|�r�t�td�|	��g}|�r�t|�nd}
|
dk	�r�|�|
�t�td|
�t�}g}g}d}d}d}t�}t|�\}}|dk	�r|�|�t�tdt|�|�|�d�d|k�r.|�d��xn|tD�]`}|�rT||k�rT�q:|�rh||k�rh�q:||k�rv�q:|�|�d}||k}|�o�t|�}|dk�r�|�s�t�td|��q:yt|�}Wn,t t!fk
�r�t�td|��w:YnXyr|�r@|dk�r@t"|dk�r$|dtd��n|t|�td��|d�n&t"|dk�rP|n|t|�d�|d�}WnVt#t$fk
�r�}z2t|t$��s�t�td|t"|��|�|��w:Wdd}~XYnXd}x |D]}t%||��r�d}P�q�W|�rt�td||��q:t&|�sdnt|�|	t|	|��}|�o<|dk	�o<t|�|	k} | �rRt�td|�tt|�d�}!t'|!d �}!d}"d}#g}$g}%�x�|D�]�}&|&||	d!k�r��q�||&|&|�}'|�r�|dk�r�||'}'y|'j(||�r�d"nd#d$�}(WnBt#k
�r&}z"t�td%|t"|��|!}"d}#PWdd}~XYnX|�r�|&dk�r�||&d&k�r�t)|d'�})|�r�|(d|)�|k�r�xdt&|&|&dd(�D]P}*||*|&|�}'|�r�|dk�r�||'}'|'j(|d"d$�}(|(d|)�|k�rzP�qzW|$�|(�|%�t*|(|��|%d(|k�r�|"d7}"|"|!k�s|�r�|dk�r�P�q�W|#�s�|�r�|�s�y|td)�d�j(|d#d$�WnFt#k
�r�}z&t�td*|t"|��|�|��w:Wdd}~XYnX|%�r�t+|%�t|%�nd}+|+|k�s�|"|!k�r>|�|�t�td+||"t,|+d,d-d.��|dd|
gk�r:|#�s:t|||dg|�},||
k�r&|,}n|dk�r6|,}n|,}�q:t�td/|t,|+d,d-d.��|�sjt-|�}-nt.|�}-|-�r�t�td0�|t"|-���g}.|dk�r�x4|$D],}(t/|(d1|-�r�d2�|-�nd�}/|.�|/��q�Wt0|.�}0|0�r�t�td3�|0|��|�t|||+||0|��||
ddgk�r\|+d1k�r\t�
d4|�|�rNt�t	�t�
|�t||g�S||k�r:t�
d5|�|�r�t�t	�t�
|�t||g�S�q:Wt|�dk�	rP|�s�|�s�|�r�t�td6�|�r�t�
d7|j1�|�|�nd|�r�|dk�	s |�	r|�	r|j2|j2k�	s |dk	�	r6t�
d8�|�|�n|�	rPt�
d9�|�|�|�	rtt�
d:|�3�j1t|�d�n
t�
d;�|�	r�t�t	�t�
|�|S)<ae
    Given a raw bytes sequence, return the best possible charsets usable to render str objects.
    If there are no results, it is a strong indicator that the source is binary/not text.
    By default, the process extracts 5 blocks of 512 bytes each to assess the mess and coherence of a given sequence,
    and gives up on a particular code page after 20% of measured mess. These criteria are customizable at will.
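
    The sampling rule described above can be sketched roughly as follows. This is a simplified illustration, not the library's code: `sample_offsets` and `should_give_up` are hypothetical helper names, and the fallback to a single chunk for short payloads is an assumption based on the "override steps and chunk_size" log message in this module.

```python
from typing import List, Sequence


def sample_offsets(length: int, steps: int = 5, chunk_size: int = 512) -> List[int]:
    """Return the start offsets of the chunks to sample (hypothetical helper)."""
    if length == 0:
        return []
    # Assumption: when the payload cannot hold steps * chunk_size bytes,
    # fall back to one chunk that covers the whole payload.
    if length <= steps * chunk_size:
        steps, chunk_size = 1, length
    stride = length // steps
    # Evenly spaced starting points, capped at the requested number of steps.
    return list(range(0, length, stride))[:steps]


def should_give_up(mess_ratios: Sequence[float], threshold: float = 0.2) -> bool:
    """Reject a code page once the mean measured mess reaches the threshold."""
    if not mess_ratios:
        return False
    return sum(mess_ratios) / len(mess_ratios) >= threshold


if __name__ == "__main__":
    print(sample_offsets(10_000))
    print(should_give_up([0.1, 0.5]))
```

    With the defaults, a 10,000-byte payload is probed at five evenly spaced offsets, and a code page whose chunks average 20% mess or more is dropped.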

    The preemptive behaviour DOES NOT replace the traditional detection workflow; it prioritizes a particular code page
    but never takes it for granted. It can improve performance.

    To focus on some code pages and/or ignore others, use cp_isolation and cp_exclusion.
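
    As a rough illustration of how an allow-list and a deny-list could narrow the candidates, here is a minimal sketch. `candidate_encodings` is a hypothetical helper, not part of the library, and the encoding names are sample values only.

```python
from typing import Iterable, List, Optional


def candidate_encodings(
    supported: Iterable[str],
    cp_isolation: Optional[Iterable[str]] = None,
    cp_exclusion: Optional[Iterable[str]] = None,
) -> List[str]:
    """Filter the supported encodings (hypothetical helper)."""
    isolation = set(cp_isolation or [])
    exclusion = set(cp_exclusion or [])
    kept = []
    for name in supported:
        # When an isolation list is given, only its members are tested.
        if isolation and name not in isolation:
            continue
        # Excluded code pages are always skipped.
        if name in exclusion:
            continue
        kept.append(name)
    return kept


if __name__ == "__main__":
    print(candidate_encodings(["utf_8", "cp1252", "latin_1"], cp_exclusion=["cp1252"]))
```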

    This function strips the SIG (BOM) from the payload/sequence every time, except for UTF-16 and UTF-32.
    By default the library does not set up any handler other than the NullHandler. If you set the 'explain'
    toggle to True, it alters the logger configuration to add a StreamHandler suitable for debugging.
    A custom logging format and handler can be set manually.
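
    The SIG handling mentioned above can be illustrated with the standard `codecs` constants. This is a sketch under stated assumptions: `detect_bom` and `payload_for_decoding` are hypothetical names, and only UTF-8, UTF-16 and UTF-32 marks are considered.

```python
import codecs
from typing import Optional, Tuple

# UTF-32 marks are checked before UTF-16 because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
_MARKS = [
    ("utf_8", codecs.BOM_UTF8),
    ("utf_32", codecs.BOM_UTF32_LE),
    ("utf_32", codecs.BOM_UTF32_BE),
    ("utf_16", codecs.BOM_UTF16_LE),
    ("utf_16", codecs.BOM_UTF16_BE),
]


def detect_bom(payload: bytes) -> Tuple[Optional[str], bytes]:
    """Return (encoding name, mark) when the payload starts with a known mark."""
    for name, mark in _MARKS:
        if payload.startswith(mark):
            return name, mark
    return None, b""


def payload_for_decoding(payload: bytes) -> bytes:
    """Strip the mark, except for UTF-16/UTF-32 whose codecs consume it themselves."""
    name, mark = detect_bom(payload)
    if name is None or name in {"utf_16", "utf_32"}:
        return payload
    return payload[len(mark):]


if __name__ == "__main__":
    print(payload_for_decoding(codecs.BOM_UTF8 + b"hello"))
```

    The UTF-16 and UTF-32 payloads keep their mark because the generic `utf_16` and `utf_32` codecs need it to pick the byte order.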
    z4Expected object of type bytes or bytearray, got: {0}rz<Encoding detection on empty bytes, assuming utf_8 intention.�utf_8gF�Nz`cp_isolation is set. use this flag for debugging purpose. limited list of encoding allowed : %s.z, cSsg|]}t|d��qS)F)r)�.0�cp�r,�G/opt/alt/python37/lib/python3.7/site-packages/charset_normalizer/api.py�
<listcomp>]szfrom_bytes.<locals>.<listcomp>zacp_exclusion is set. use this flag for debugging purpose. limited list of encoding excluded : %s.cSsg|]}t|d��qS)F)r)r*r+r,r,r-r.hsz^override steps (%i) and chunk_size (%i) as content does not fit (%i byte(s) given) parameters.r	z>Trying to detect encoding from a tiny portion of ({}) byte(s).zIUsing lazy str decoding because the payload is quite large, ({}) byte(s).z@Detected declarative mark in sequence. Priority +1 given for %s.zIDetected a SIG or BOM mark on first %i byte(s). Priority +1 given for %s.�ascii>�utf_32�utf_16z[Encoding %s wont be tested as-is because it require a BOM. Will try some sub-encoder LE/BE.z2Encoding %s does not provide an IncrementalDecoderg��A)�encodingz9Code page %s does not fit given bytes sequence at ALL. %sTzW%s is deemed too similar to code page %s and was consider unsuited already. Continuing!zpCode page %s is a multi byte encoding table and it appear that at least one character was encoded using n-bytes.����ignore�strict)�errorszaLazyStr Loading: After MD chunk decode, code page %s does not fit given bytes sequence at ALL. %s�����gj�@z^LazyStr Loading: After final lookup, code page %s does not fit given bytes sequence at ALL. %szc%s was excluded because of initial chaos probing. Gave up %i time(s). Computed mean chaos is %f %%.�d�)�ndigitsz=%s passed initial chaos probing. Mean measured chaos is %f %%z&{} should target any language(s) of {}g�������?�,z We detected language {} using {}z.Encoding detection: %s is most likely the one.zoEncoding detection: %s is most likely the one as we detected a BOM or SIG within the beginning of the sequence.zONothing got out of the detection process. Using ASCII/UTF-8/Specified fallback.z7Encoding detection: %s will be used as a fallback matchz:Encoding detection: utf_8 will be used as a fallback matchz:Encoding detection: ascii will be used as a fallback matchz]Encoding detection: Found %s as plausible (best-candidate) for content. 
With %i alternatives.z=Encoding detection: Unable to determine any suitable charset.)4�
isinstance�	bytearray�bytes�	TypeError�format�type�logger�level�
addHandler�explain_handler�setLevelr�len�debug�
removeHandler�logging�WARNINGrr�log�join�intrrr�append�setrr�addrr�ModuleNotFoundError�ImportError�str�UnicodeDecodeError�LookupErrorr�range�max�decode�minr�sum�roundrrr
r
r2Zfingerprint�best)1rr r!r"r#r$r%r&Zprevious_logger_level�lengthZis_too_small_sequenceZis_too_large_sequenceZprioritized_encodingsZspecified_encodingZtestedZtested_but_hard_failureZtested_but_soft_failureZfallback_asciiZfallback_u8Zfallback_specified�resultsZsig_encodingZsig_payloadZ
encoding_ianaZdecoded_payloadZbom_or_sig_availableZstrip_sig_or_bomZis_multi_byte_decoder�eZsimilar_soft_failure_testZencoding_soft_failedZr_Zmulti_byte_bonusZmax_chunk_gave_upZearly_stop_countZlazy_str_hard_failureZ	md_chunksZ	md_ratios�iZcut_sequence�chunkZchunk_partial_size_chk�jZmean_mess_ratioZfallback_entryZtarget_languagesZ	cd_ratiosZchunk_languagesZcd_ratios_mergedr,r,r-�
from_bytes%sZ
rh)	�fpr r!r"r#r$r%r&r'c	Cst|��|||||||�S)z�
    Same as from_bytes, but takes a file pointer that is already open.
    Does not close the file pointer.
    )rh�read)rir r!r"r#r$r%r&r,r,r-�from_fpsrk)	�pathr r!r"r#r$r%r&r'c	
Cs,t|d��}t||||||||�SQRXdS)z�
    Same as from_bytes, with one extra step: the given file path is opened and read in binary mode.
    Can raise IOError.
    �rbN)�openrk)	rlr r!r"r#r$r%r&rir,r,r-�	from_pathsro)rlr r!r"r#r$r%r'c	Cs�t|||||||�}t|�}tt|��}	t|�dkrBtd�|���|��}
|	dd|
j7<t	d�t
|��|d�|	���d��}|�
|
���WdQRX|
S)zi
    Take a (text-based) file path and try to create another file next to it, this time using UTF-8.
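
    One way the "file next to it" could be named is sketched below. This is an assumption for illustration only: `normalized_target_path` is a hypothetical helper, and putting the detected encoding between the stem and the extension is a guess suggested by the `basename`, `splitext` and "-" constants visible in this module.

```python
from os.path import basename, dirname, join, splitext


def normalized_target_path(path: str, encoding: str) -> str:
    """Build a sibling file name carrying an encoding label (hypothetical helper)."""
    stem, ext = splitext(basename(path))
    # Keep the original directory so the new file lands next to the source.
    return join(dirname(path), stem + "-" + encoding + ext)


if __name__ == "__main__":
    print(normalized_target_path("notes.txt", "cp1252"))
```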
    rz;Unable to normalize "{}", no encoding charset seems to fit.�-z{}r)�wbN)ror�listrrK�IOErrorrDrar2rnrX�replacerQ�write�output)rlr r!r"r#r$r%rc�filenameZtarget_extensions�resultrir,r,r-�	normalize7s* ry)rrrNNTF)rrrNNTF)rrrNNTF)rrrNNT)1rN�os.pathrr�typingrrrr�osrrWrXZcdr
rrr
ZconstantrrrrZmdrZmodelsrr�utilsrrrrrr�	getLoggerrF�
StreamHandlerrI�setFormatter�	FormatterrBrR�float�boolrhrkroryr,r,r,r-�<module>sb
 
Y