davanstrien HF staff committed on
Commit
662b961
•
1 Parent(s): 6d2b0a3

create card template

Files changed (1)
  1. dataset_card_template.py +40 -0
dataset_card_template.py ADDED
@@ -0,0 +1,40 @@
+ DATASET_CARD_TEMPLATE = """
+ # Dataset Card for {hf_repo}
+
+ ## Dataset Description
+
+ This dataset contains images converted from PDFs using the PDFs to Page Images Converter Space.
+
+ - **Number of images:** {num_images}
+ - **Number of PDFs processed:** {num_pdfs}
+ - **Sample size per PDF:** {sample_size}
+ - **Created on:** {creation_date}
+
+ ## Dataset Creation
+
+ ### Source Data
+
+ The images in this dataset were generated from user-uploaded PDF files.
+
+ ### Processing Steps
+
+ 1. PDF files were uploaded to the PDFs to Page Images Converter.
+ 2. Each PDF was processed, converting selected pages to images.
+ 3. The resulting images were saved and uploaded to this dataset.
+
+ ## Dataset Structure
+
+ The dataset consists of JPEG images, each representing a single page from the source PDFs.
+
+ ### Data Fields
+
+ - `images/`: A folder containing all the converted images.
+
+ ### Data Splits
+
+ This dataset does not have specific splits.
+
+ ## Additional Information
+
+ - **Contributions:** Thanks to the PDFs to Page Images Converter for creating this dataset.
+ """
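
The template above exposes five placeholders (`hf_repo`, `num_images`, `num_pdfs`, `sample_size`, `creation_date`), which suggests it is meant to be filled with `str.format`. The commit does not show the calling code, so the following is only a minimal sketch under that assumption. `CARD_TEMPLATE_EXCERPT` is an abbreviated stand-in for `DATASET_CARD_TEMPLATE`, and `render_card` and the sample values are hypothetical names introduced for illustration.

```python
from datetime import date

# Abbreviated stand-in for DATASET_CARD_TEMPLATE (assumption: the real
# constant would be imported from dataset_card_template.py instead).
CARD_TEMPLATE_EXCERPT = """
# Dataset Card for {hf_repo}

- **Number of images:** {num_images}
- **Number of PDFs processed:** {num_pdfs}
- **Sample size per PDF:** {sample_size}
- **Created on:** {creation_date}
"""


def render_card(template, hf_repo, num_images, num_pdfs, sample_size,
                creation_date=None):
    """Hypothetical helper: fill the card placeholders with str.format."""
    if creation_date is None:
        # Default to today's date in ISO format (YYYY-MM-DD).
        creation_date = date.today().isoformat()
    filled = template.format(
        hf_repo=hf_repo,
        num_images=num_images,
        num_pdfs=num_pdfs,
        sample_size=sample_size,
        creation_date=creation_date,
    )
    # The template starts with a newline; strip it so the card begins
    # with its heading, and end with a single trailing newline.
    return filled.strip() + "\n"


# Sample values are made up for illustration only.
card = render_card(CARD_TEMPLATE_EXCERPT, "user/pdf-pages", 120, 4, 30,
                   creation_date="2024-01-01")
print(card)
```

With `str.format`, a missing keyword raises `KeyError`, and any literal braces added to the template later would need to be doubled (`{{` and `}}`).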