This repository was archived by the owner on Jan 7, 2021. It is now read-only.
You could recreate this hexadecimal hash yourself using the `SHA-1 algorithm <https://en.wikipedia.org/wiki/SHA-1>`_.

>>> import hashlib
>>> hashlib.sha1(obj.pdf).hexdigest()
'872b9b858f5f3e6bb6086fec7f05dd464b60eb26'

.. attribute:: document_obj.full_text

Returns the full text of the document, as extracted from the original PDF by DocumentCloud. Results may vary, but this will give you what they got. Currently, DocumentCloud only makes this available for public documents.
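The ``file_hash`` check described in this change can also be sketched as a small standalone helper. This is a minimal illustration, not part of the library: it assumes ``obj.pdf`` yields the raw PDF as bytes and that ``file_hash`` is the SHA-1 hex digest of those bytes, as the added documentation states. The function names ``sha1_hex``, ``sha1_hex_stream``, and ``matches_file_hash`` are hypothetical.

```python
import hashlib
import io


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of raw bytes as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()


def sha1_hex_stream(fileobj, chunk_size: int = 65536) -> str:
    """Hash a binary file-like object in chunks instead of loading it whole."""
    digest = hashlib.sha1()
    # iter() with a sentinel keeps reading until read() returns b"" (end of file).
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_file_hash(pdf_bytes: bytes, file_hash: str) -> bool:
    """Compare locally computed SHA-1 with a reported hash (hypothetical helper)."""
    # Normalise case and whitespace so the comparison is not tripped up by formatting.
    return sha1_hex(pdf_bytes) == file_hash.strip().lower()


if __name__ == "__main__":
    sample = b"abc"  # stand-in for the raw PDF bytes
    print(sha1_hex(sample))
    print(sha1_hex_stream(io.BytesIO(sample)))
```

In practice one might call ``matches_file_hash(obj.pdf, obj.file_hash)`` after downloading a document to confirm the bytes were not altered in transit. The streaming variant is only useful when the PDF is read from disk rather than held in memory.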
@@ -209,6 +217,23 @@ <h2>Metadata<a class="headerlink" href="#metadata" title="Permalink to this head
</div>
</dd></dl>
<dl class="attribute">
<dt id="document_obj.file_hash">
<tt class="descclassname">document_obj.</tt><tt class="descname">file_hash</tt><a class="headerlink" href="#document_obj.file_hash" title="Permalink to this definition">¶</a></dt>
<dd><p>A hash representation of the raw PDF data as a hexadecimal string.</p>
<p>You could recreate this hexadecimal hash yourself using the <a class="reference external" href="https://en.wikipedia.org/wiki/SHA-1">SHA-1 algorithm</a>.</p>
<tt class="descclassname">document_obj.</tt><tt class="descname">full_text</tt><a class="headerlink" href="#document_obj.full_text" title="Permalink to this definition">¶</a></dt>
@@ -395,6 +420,12 @@ <h3><a href="index.html">Table Of Contents</a></h3>