###### tags: `Task` `Dataset` `Mediaad`
# Mediaad Dataset Description
[toc]
---



## Data Download
The files can be downloded from [here](https://drive.google.com/file/d/1tVYbSaG0JdxTLLPSAESnBNvj_RIxKg5B/view?usp=sharing).
## File Descriptions
- **document.csv**
Here, by document we mean a page of a website. Each row of this file is about a page metadata. The information is about the website and the publisher that the page(document) belongs to.
<!-- - *docId*
- *source(website)*
- *publisher* -->
:::info
```json=
docId,source,publisher
1,0,0
3,1,1
4,2,2
5,3,3
6,4,4
7,1,1
8,5,3
9,6,5
10,7,6
```
:::
- **document_topic.csv**
This file describes the topic distributions of pages. Each row shows a pair of a page and a topic and a confidence level. We gathered this data by preprocessing the page contents. The fields and samples are as follows:
<!-- - docId
- topicId
- Confidence -->
:::info
```json=
docId,topicId,confidence
10743259,37,0.49709868
10743259,23,0.016993675
10743259,10,0.37149945
10743259,4,0.105349235
10743258,33,0.24575157
10743258,28,0.60887676
10743258,5,0.10920001
10743257,33,0.24552396
10743257,28,0.60861677
```
:::
- **creative.csv**
This file contains the metadata about the ads we show users. Each row describes a different ad(creative), its campaign and its advertiser.
:::info
```json=
creativeId,campaignId,advertiserId
7867,5918,8414
7866,5918,8414
7865,5918,8414
7863,8343,8414
7862,8343,8414
7861,8343,8414
7860,8343,8414
7859,8343,8414
7858,8343,8414
```
:::
- **creative_title.csv**
The bag of word representation of creative titles. Each row describes a pair of a creative and q word the present in its title. ==For anonymization we present the id of words not the true values==.
:::info
```json=
creativeId,wordId
6,24
6,25
6,26
6,27
6,28
6,29
6,30
6,31
7,32
```
:::
- **creative_image.csv**
This file describes the information about the creative images. For reasons such as anonymization, fewer ram usage and ease of training, we present some extracted features for creative images. The features are a list of 512 features that are string coded.
:::info
```jsonld=
creativeId,imageFeatures
7867,"[0.0524589940905571, 0.0, 0.04262353479862213, 0.3335827887058258, 0.38356322050094604, 0.0016337124397978187, 0.34486910700798035, 0.5254691243171692, 0.05376036837697029, 0.0790339931845665, 0.5009145736694336, 0.6770318150520325, 0.0, 0.04745246469974518, 0.11314650624990463, 0.00764562888070941, 0.0, 0.1501387357711792, 0.03241937980055809, 0.004932682495564222, 0.0, 0.0, 0.5730596780776978, 0.019595881924033165, 0.0, 0.09223274141550064, 0.02514776773750782, 0.07260636240243912, 0.0, 0.5376994609832764, 0.0, 0.1540597379207611, 0.9559652805328369, 0.10546236485242844, 0.003840308403596282, 0.0, 0.09895025938749313, 0.0, 0.17928677797317505, 0.0, 0.5366105437278748, 0.2375306487083435, 0.04689086228609085, 0.8058763742446899, 0.0, 0.0, 0.01303254533559084, 1.113734483718872, 0.44265198707580566, 0.09719770401716232, 0.6938168406486511, 0.28377801179885864, 0.1759302318096161, 0.0, 0.012601906433701515, 0.3471967279911041, 0.0, 0.0766461044549942, 0.0, 0.00262973690405488, 0.0063209836371243, 0.74350905418396, 0.0, 0.0, 0.13608892261981964, 0.0, 0.24228109419345856, 0.7389951348304749, 0.0, 0.1485598236322403, 1.3823773860931396, 0.2141677439212799, 0.027074171230196953, 0.0, 0.0, 0.0, 0.26120033860206604, 0.28101813793182373, 0.0, 0.08263696730136871, 0.1578405648469925, 0.10821836441755295, 0.0, 0.0057042865082621574, 0.0, 0.019689170643687248, 0.0, 0.17585083842277527, 0.011907795444130898, 0.002454791683703661, 0.23566392064094543, 0.10202024132013321, 0.11496552079916, 0.012451911345124245, 1.870426058769226, 0.14807745814323425, 0.42014285922050476, 0.10208401829004288, 0.25753581523895264, 0.04493357986211777, 0.09856701642274857, 0.11120041459798813, 0.22483405470848083, 0.25453343987464905, 0.0032995236106216908, 0.9838165044784546, 0.8109205961227417, 0.4817812442779541, 0.014035709202289581, 0.04418061673641205, 0.011374461464583874, 0.12863709032535553, 0.03337208926677704, 0.0, 1.4399311542510986, 0.0, 0.49707287549972534, 0.2588406801223755, 0.5210369825363159, 0.0, 0.10058943182229996, 0.05463501811027527, 0.8610427975654602, 0.5020743012428284, 0.030319757759571075, 0.8928760290145874, 0.506088376045227, 0.2090473473072052, 0.003232160583138466, 1.6961923837661743, 0.028163563460111618, 0.05816756933927536, 0.35724321007728577, 1.9972378015518188, 0.4023135304450989, 0.03822699934244156, 0.47549718618392944, 0.8358854055404663, 0.442721962928772, 0.004401505459100008, 0.2167019248008728, 0.0018225357634946704, 0.11354278773069382, 0.0, 1.4138048887252808, 0.3715467154979706, 0.20260004699230194, 0.0011393165914341807, 0.24493104219436646, 0.0422644168138504, 0.07470837980508804, 0.44676506519317627, 0.6388846039772034, 0.21214638650417328, 0.9269633293151855, 0.1876787394285202, 0.0, 0.014485415071249008, 0.14804451167583466, 0.0, 0.10219316929578781, 0.0, 0.0, 0.0011359964264556766, 0.6181788444519043, 0.012399531900882721, 0.023947902023792267, 0.0002116563991876319, 0.054431620985269547, 0.2317836433649063, 0.14420899748802185, 0.6919376254081726, 0.5663913488388062, 0.01806214265525341, 0.0, 0.0, 0.8541684746742249, 0.0, 0.04821351170539856, 0.0, 0.13298183679580688, 0.26571476459503174, 0.2441360503435135, 0.11967230588197708, 0.43411985039711, 0.6160688400268555, 0.19833825528621674, 0.0, 0.0, 0.06707854568958282, 0.024971941486001015, 0.9891048669815063, 0.0, 0.1065264344215393, 0.09542371332645416, 0.6286377310752869, 0.06564754992723465, 0.9165804982185364, 0.022102609276771545, 0.01808992587029934, 0.0713435634970665, 0.004327212926000357, 0.18033963441848755, 0.2409106194972992, 0.19836527109146118, 0.0, 0.0, 0.0831923633813858, 0.1496896892786026, 0.14142830669879913, 0.2097349762916565, 0.0, 0.4512734115123749, 0.8316828608512878, 0.025351183488965034, 0.7340601682662964, 0.0, 0.1781339794397354, 0.11174163967370987, 0.08710390329360962, 0.18852172791957855, 0.02282477356493473, 0.10788831859827042, 0.04018762335181236, 0.6255924701690674, 0.09888948500156403, 0.3072219491004944, 0.5751482248306274, 0.0, 0.25533074140548706, 0.0, 0.142370343208313, 0.36775970458984375, 0.0012231525033712387, 0.0, 0.09687993675470352, 0.42802900075912476, 0.04800980165600777, 0.0, 0.05185330659151077, 0.4435133934020996, 0.0, 0.0, 0.045000672340393066, 0.0, 0.010632830671966076, 0.14304840564727783, 0.4098302125930786, 0.046728916466236115, 0.03777185082435608, 1.3611832857131958, 2.5127615928649902, 0.1033560261130333, 0.010225065983831882, 0.28524771332740784, 0.11239043623209, 0.7252575755119324, 0.8900536298751831, 0.0, 0.38185858726501465, 0.06111784651875496, 0.0, 0.040989700704813004, 0.0, 0.008436067961156368, 0.01430627889931202, 0.36651307344436646, 0.13438716530799866, 0.0193697027862072, 0.0, 0.22299551963806152, 0.03810221701860428, 0.0777783989906311, 0.01994059421122074, 0.045812010765075684, 0.0, 0.4665735065937042, 0.0, 0.277442067861557, 0.8755587339401245, 0.2046937644481659, 0.050224144011735916, 0.1398463100194931, 0.062250956892967224, 0.03436265140771866, 0.21302521228790283, 0.1241559311747551, 1.8673889636993408, 0.38291168212890625, 0.011038499884307384, 0.06205926835536957, 0.02232370525598526, 0.0, 0.0, 0.041719697415828705, 0.04538219794631004, 0.2916184961795807, 0.049691107124090195, 0.5081413984298706, 0.11252861469984055, 0.3664450943470001, 0.531741201877594, 0.11930695176124573, 0.5198532938957214, 0.8124033212661743, 0.4518541097640991, 0.21484598517417908, 0.6128281354904175, 0.0, 0.3964100480079651, 0.9020141363143921, 0.18474553525447845, 0.038612764328718185, 0.0, 0.0, 1.7949954271316528, 0.1832423210144043, 0.0, 0.15385232865810394, 1.3681594133377075, 0.15732046961784363, 0.04315795749425888, 0.0, 0.0, 0.06659245491027832, 0.010794873349368572, 0.0, 0.16922619938850403, 0.004929518327116966, 0.09100309759378433, 0.0, 0.0008615456172265112, 0.0, 0.014330863952636719, 0.3634317219257355, 0.5796082615852356, 0.004235472530126572, 0.0866054818034172, 0.19918595254421234, 0.0, 0.012094324454665184, 0.008820407092571259, 0.0003866589686367661, 0.07019684463739395, 0.37010496854782104, 0.051394619047641754, 0.18688522279262543, 0.06396914273500443, 0.39068594574928284, 0.1060214713215828, 0.010107624344527721, 0.0, 0.0, 0.022860363125801086, 0.5025042295455933, 0.5866639018058777, 0.9520065784454346, 0.08304600417613983, 0.38530126214027405, 0.27544865012168884, 0.007508657407015562, 0.26225095987319946, 0.0, 0.8277463912963867, 0.0, 0.793016791343689, 0.2730690538883209, 0.0, 0.13819585740566254, 0.027830492705106735, 0.18068702518939972, 2.417189836502075, 0.0, 1.1181780099868774, 0.0, 1.9062105417251587, 0.08524265885353088, 0.0, 0.0815972313284874, 1.2653571367263794, 0.0, 0.6985716223716736, 0.0, 0.044726401567459106, 0.12374112010002136, 0.23629596829414368, 0.14078129827976227, 0.47028273344039917, 0.0, 0.2854258716106415, 0.5492580533027649, 0.31275811791419983, 0.15689019858837128, 0.0, 0.1441369354724884, 0.0071984343230724335, 0.609931230545044, 0.14004181325435638, 0.585576057434082, 0.0, 0.22358006238937378, 0.25531983375549316, 1.235224962234497, 2.5459024906158447, 0.12151942402124405, 0.04073060303926468, 0.0037157435435801744, 0.1867835372686386, 0.5576404333114624, 0.0, 0.26616135239601135, 0.01987653411924839, 0.0, 0.15705813467502594, 0.17579331994056702, 0.23685690760612488, 0.0, 0.08871358633041382, 0.11719027161598206, 0.05135223641991615, 0.00845278985798359, 0.0, 0.0011314277071505785, 0.0954841673374176, 0.0, 0.29563120007514954, 0.0, 0.5004280209541321, 0.15429893136024475, 1.0944734811782837, 1.399240255355835, 0.0, 0.024041304364800453, 0.0, 0.02393883839249611, 0.0, 0.11461999267339706, 0.0, 0.32238876819610596, 0.11020731180906296, 0.0, 0.2500244081020355, 0.0845474824309349, 0.0, 0.47124549746513367, 0.10645349323749542, 0.9488299489021301, 0.5220524072647095, 0.01650909334421158, 0.8671882748603821, 0.0, 0.02164115197956562, 0.4217461347579956, 0.016102958470582962, 0.17660780251026154, 0.23909291625022888, 0.0053427438251674175, 0.1923588663339615, 0.1154368445277214, 0.0, 0.10403607785701752, 0.0, 0.10917922109365463, 0.22670328617095947, 1.838025450706482, 0.1935068964958191, 0.4185929000377655, 0.0, 0.30573517084121704, 0.01434508990496397, 0.0, 0.10782365500926971, 0.0, 0.0, 1.0141878128051758, 0.14969369769096375, 0.10680084675550461, 0.16886450350284576, 0.019656404852867126, 0.08190245181322098, 0.33918142318725586, 0.13427631556987762, 0.01647268421947956, 0.19352316856384277, 0.4094108045101166, 0.12720051407814026, 0.4061034321784973, 0.0, 0.0, 0.007430760655552149, 0.2954533100128174, 0.34858548641204834, 0.11279997229576111, 0.0, 0.2815641760826111, 0.04496641457080841, 0.14596092700958252, 0.0, 0.14465385675430298, 0.09087847173213959, 0.07650305330753326, 0.22364288568496704, 0.23372770845890045, 0.5394803881645203, 0.15250787138938904, 0.34451788663864136, 0.7787312865257263, 0.07829540222883224, 0.16016195714473724, 0.0, 0.3051452338695526, 0.018119463697075844]"
```
:::
- **user_page_view.csv**
It is the log of users visiting documents. Here, by document we mean the page that user visits. Each row belongs to a page visit with the follwing fields:
<!-- - userId
- docId
- timetamp -->
:::info
```json=
userId,docId,timestamp
821961,8116,1579599211445
15321,9533442,1579599211443
1125090,9410379,1579599211440
407101,8616213,1579599211429
781615,9543366,1579599211429
1210398,9395724,1579599211426
470110,9546771,1579599211423
767395,9523927,1579599211423
630846,9531785,1579599211409
```
:::
- **event.csv**
This file contains the information about the **Context** in which the ad is shown to the user. The *displayId* is an Id that unifies the display and can link this contextual information to the click data. In each display multiple ads are shown to the user and one or none of them is clicked. *widgetId* is the id of the widget that the ads is shown to user in it. Each page may contain multiple widgets and in each widget multiple ads are shown to the user.
A sample page with multiple widgets:
Each row describes the context of a display in which some ads are shown to a user:
:::info
```json=
displayId,timestamp,docId,widgetId,userId,device,OS,browser
4706262,1578429005696,3543873,6262,2688642,0,0,0
4706267,1578429007726,6245475,607,2688641,1,3,0
4706260,1578429012060,4416499,11458,2688638,0,0,1
4706255,1578429017218,6246028,9358,1962852,0,0,0
4706256,1578429021388,5327047,9358,2687719,0,0,0
4706235,1578429023501,6245689,9358,2688634,0,0,0
4706181,1578429029640,83625,5802,2688629,0,0,5
4706228,1578429035793,4096624,11458,745082,1,1,4
4706234,1578429036317,6244430,7691,2688626,0,0,0
```
:::
- click_train.csv
This file shows for each displayId the user clicked on which ad. For each displayId there is multiple entries, but one of them is clicked. You can gather the display context information by joining this data with **event.csv** data. This data is the main data for learning the classifier to predict the click events.
:::info
```json=
displayId,creativeId,clicked
1210227,7182,0
1210227,7125,0
1210227,7181,0
1210227,535,0
1210227,7174,1
1214987,7125,0
1214987,7092,1
1214987,5652,0
1248098,7182,0
```
:::
- click_test.csv
This file is the same ad **click_train.csv** but the clicked column is removed. This data is for evaluating your predictor and your predictor should guess the click label for this data. We evaluate your performance by this data.
:::info
```json=
displayId,creativeId
151650,7585
151650,6257
151650,6690
151938,7454
151938,7370
151938,123
151938,6690
151938,7715
151938,7379
```
:::
{"metaMigratedAt":"2023-06-15T11:33:56.654Z","metaMigratedFrom":"Content","title":"Mediaad Dataset Description","breaks":true,"contributors":"[{\"id\":\"26ae0d15-d198-4ce2-aad2-1d1cee820a88\",\"add\":14173,\"del\":373}]"}