# Perform Foundational Data, ML, and AI Tasks in Google Cloud
**Vertex AI: Qwik Start**
:::success
**Insight**
1. Enable Google Cloud services
2. Create Vertex AI custom service account for Vertex Tensorboard integration
3. Launch Vertex AI Workbench notebook
4. Clone the lab repository
5. Install lab dependencies
:::
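The steps above can be sketched from Cloud Shell roughly as follows; the API list and the service-account name (`vertex-custom-sa`) are illustrative assumptions, not the lab's exact values:

```shell
# Enable the services the lab relies on (standard API identifiers)
gcloud services enable compute.googleapis.com notebooks.googleapis.com aiplatform.googleapis.com

# Create a custom service account for the Vertex TensorBoard integration
# (account name and role below are placeholders -- use the ones your lab specifies)
PROJECT_ID=$(gcloud config get-value project)
SA_NAME=vertex-custom-sa
gcloud iam service-accounts create $SA_NAME --display-name="Vertex custom service account"
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.admin"
```

The Workbench notebook, repository clone, and dependency install are done from the Vertex AI Workbench UI and the notebook terminal.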
---
**Dataprep: Qwik Start**
:::success
**Insight**
1. Create a Cloud Storage bucket in your project
2. Initialize Cloud Dataprep
3. Create a flow
4. Import datasets
5. Prep the candidate file
6. Wrangle the Contributions file and join it to the Candidates file
7. Summary of data
8. Rename columns
:::
---
**Dataflow: Qwik Start - Templates**
:::success
**Insight**
1. Create a BigQuery dataset and table using Cloud Shell
2. Create a BigQuery dataset and table using the Cloud Console
3. Run the pipeline
4. Submit a query
:::
* Google Cloud Dataflow supports batch processing. True
* Which Dataflow Template is used in the lab to run the pipeline? Pub/Sub to BigQuery
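For reference, the same pipeline can be launched from the CLI; this is a sketch assuming the lab's `taxirides.realtime` table and the public `taxirides-realtime` topic (the job name is arbitrary; adjust the region to your assignment):

```shell
PROJECT_ID=$(gcloud config get-value project)

# Dataset and table the template writes into
bq mk taxirides
bq mk --schema ride_id:string,point_idx:integer,latitude:float,longitude:float,timestamp:timestamp,meter_reading:float,meter_increment:float,ride_status:string,passenger_count:integer -t taxirides.realtime

# Launch the Pub/Sub to BigQuery template
gcloud dataflow jobs run ps-to-bq-job \
  --gcs-location gs://dataflow-templates/latest/PubSub_to_BigQuery \
  --region us-central1 \
  --parameters inputTopic=projects/pubsub-public-data/topics/taxirides-realtime,outputTableSpec=$PROJECT_ID:taxirides.realtime
```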
---
**Dataflow: Qwik Start - Python**
:::success
**Insight**
1. Create a Cloud Storage bucket
2. Install pip and the Cloud Dataflow SDK
3. Run an example pipeline remotely
4. Check that your job succeeded
:::
* Dataflow temp_location must be a valid Cloud Storage URL. True
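Steps 2-3 can be sketched with the wordcount example bundled in the Beam SDK; the bucket name and region here are placeholders:

```shell
# Install the Dataflow (Apache Beam) SDK for Python
pip install "apache-beam[gcp]"

# Run the wordcount example remotely on the Dataflow service
# (BUCKET is a placeholder -- use the bucket created in step 1)
PROJECT_ID=$(gcloud config get-value project)
BUCKET=gs://$PROJECT_ID-bucket
python -m apache_beam.examples.wordcount \
  --project $PROJECT_ID \
  --runner DataflowRunner \
  --temp_location $BUCKET/tmp/ \
  --output $BUCKET/results/output \
  --region us-central1
```

Job success is then checked on the Dataflow page of the Cloud Console.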
---
**Dataproc: Qwik Start - Console**
:::success
**Insight**
1. Create a cluster
2. Submit a job
3. View the job output
4. Update a cluster
:::
* Which type of Dataproc job is submitted in the lab? Spark
* Dataproc helps users process, transform and understand vast quantities of data. True
---
**Dataproc: Qwik Start - Command Line**
:::success
**Insight**
1. Create a cluster
2. Submit a job
3. Update a cluster
:::
* Clusters can be created and scaled quickly with a variety of virtual machine types, disk sizes, and number of nodes. True
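The three steps map to three gcloud commands; the cluster name, region, and job arguments below are illustrative (the SparkPi example ships with Dataproc's Spark install):

```shell
# 1. Create a cluster in a default region
gcloud config set dataproc/region us-central1
gcloud dataproc clusters create example-cluster --worker-boot-disk-size 500

# 2. Submit the SparkPi example job bundled with Spark
gcloud dataproc jobs submit spark --cluster example-cluster \
  --class org.apache.spark.examples.SparkPi \
  --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- 1000

# 3. Update (scale) the cluster to more workers
gcloud dataproc clusters update example-cluster --num-workers 4
```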
---
**Cloud Natural Language API: Qwik Start**
:::success
**Insight**
1. Create an API key
2. Make an entity analysis request
:::
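Step 2 reduces to one REST call; `API_KEY` is assumed to hold the key created in step 1, and the sample sentence is only an example:

```shell
# API_KEY is assumed to hold the key from step 1
API_KEY=YOUR_API_KEY

# Entity analysis request against the Natural Language REST API
cat > nl-request.json <<EOF
{
  "document": {
    "type": "PLAIN_TEXT",
    "content": "Michelangelo Caravaggio, Italian painter, is known for 'The Calling of Saint Matthew'."
  },
  "encodingType": "UTF8"
}
EOF
curl -s -X POST -H "Content-Type: application/json" --data-binary @nl-request.json \
  "https://language.googleapis.com/v1/documents:analyzeEntities?key=${API_KEY}"
```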
---
**Google Cloud Speech API: Qwik Start**
:::success
**Insight**
1. Create an API key
2. Create your Speech API request
3. Call the Speech API
:::
---
**Video Intelligence: Qwik Start**
:::success
**Insight**
1. Enable the Video Intelligence API
2. Set up authorization
3. Make an annotate video request
:::
---
**Perform Foundational Data, ML, and AI Tasks in Google Cloud: Challenge Lab**
[Challenge Lab](https://medium.com/@adhwaithchandrann00b/perform-foundational-data-ml-and-ai-tasks-in-google-cloud-challenge-lab-b7cc723ae1e8)
Set the following variables using the values from your Quest instructions:
```
# Lab-specific values
REGION=
Dataset=
TABLE=
TASK_3=
TASK_4=

PROJECT_ID=$(gcloud config get-value project)
target=$Dataset.$TABLE
bucket_name=$PROJECT_ID-marking

# Create the BigQuery dataset and the Cloud Storage bucket
bq mk $Dataset
gsutil mb gs://$bucket_name
```
```
cat > table.py <<EOF
from google.cloud import bigquery
# Construct a BigQuery client object.
client = bigquery.Client()
# $TABLE (not the literal "TABLE") so the heredoc expands the table name
table_id = "$PROJECT_ID.$Dataset.$TABLE"
schema = [
    bigquery.SchemaField("guid", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("isActive", "BOOLEAN", mode="NULLABLE"),
    bigquery.SchemaField("firstname", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("surname", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("company", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("email", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("phone", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("address", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("about", "STRING", mode="NULLABLE"),
    bigquery.SchemaField("registered", "TIMESTAMP", mode="NULLABLE"),
    bigquery.SchemaField("latitude", "FLOAT", mode="NULLABLE"),
    bigquery.SchemaField("longitude", "FLOAT", mode="NULLABLE"),
]
table = bigquery.Table(table_id, schema=schema)
table = client.create_table(table)  # Make an API request.
print(
    "Created table {}.{}.{}".format(table.project, table.dataset_id, table.table_id)
)
EOF
```
```
python3 table.py
gcloud dataflow jobs run lab-transform \
  --gcs-location gs://dataflow-templates-$REGION/latest/GCS_Text_to_BigQuery \
  --region $REGION \
  --worker-machine-type e2-standard-2 \
  --staging-location gs://$PROJECT_ID-marking/temp \
  --parameters javascriptTextTransformGcsPath=gs://cloud-training/gsp323/lab.js,JSONPath=gs://cloud-training/gsp323/lab.schema,javascriptTextTransformFunctionName=transform,outputTable=$PROJECT_ID:$Dataset.$TABLE,inputFilePattern=gs://cloud-training/gsp323/lab.csv,bigQueryLoadingTemporaryDirectory=gs://$PROJECT_ID-marking/bigquery_temp
```
```
gcloud dataproc clusters create cluster-f357 \
  --region $REGION \
  --zone $REGION-a \
  --master-machine-type e2-standard-2 \
  --master-boot-disk-size 500 \
  --num-workers 2 \
  --worker-machine-type e2-standard-2 \
  --worker-boot-disk-size 500 \
  --image-version 2.0-debian10 \
  --project $PROJECT_ID
```
```
# SSH into the first worker node (the zone must match the cluster zone)
gcloud beta compute ssh cluster-f357-w-0 --zone $REGION-a -- -vvv
```
```
hdfs dfs -cp gs://cloud-training/gsp323/data.txt /data.txt
exit
```
```
gcloud config set dataproc/region $REGION
gcloud dataproc jobs submit spark \
  --cluster cluster-f357 \
  --class org.apache.spark.examples.SparkPageRank \
  --jars file:///usr/lib/spark/examples/jars/spark-examples.jar -- /data.txt
```
```
# Enable the API Keys service, create a key, and capture its key string
gcloud services enable apikeys.googleapis.com
gcloud alpha services api-keys create --display-name="testname"
KEY_NAME=$(gcloud alpha services api-keys list --format="value(name)" --filter "displayName=testname")
API_KEY=$(gcloud alpha services api-keys get-key-string $KEY_NAME --format="value(keyString)")
echo $API_KEY
```
```
gcloud iam service-accounts create techvine \
--display-name "my natural language service account"
gcloud iam service-accounts keys create ~/key.json \
--iam-account techvine@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com
export GOOGLE_APPLICATION_CREDENTIALS="/home/$USER/key.json"
gcloud auth activate-service-account techvine@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com --key-file=$GOOGLE_APPLICATION_CREDENTIALS
gcloud ml language analyze-entities --content="Old Norse texts portray Odin as one-eyed and long-bearded, frequently wielding a spear named Gungnir and wearing a cloak and a broad hat." > result.json
gcloud auth login --no-launch-browser
```
Click the link in the output, sign in with the lab credentials, then copy the verification code and paste it into the command line.
```
# Upload the Natural Language result to the task 4 bucket
gsutil cp result.json $TASK_4

# Build and send the Speech-to-Text request (task 3)
cat > request.json <<EOF
{
"config": {
"encoding":"FLAC",
"languageCode": "en-US"
},
"audio": {
"uri":"gs://cloud-training/gsp323/task3.flac"
}
}
EOF
curl -s -X POST -H "Content-Type: application/json" --data-binary @request.json \
"https://speech.googleapis.com/v1/speech:recognize?key=${API_KEY}" > result.json
gsutil cp result.json $TASK_3

# Video Intelligence: create a service account and get an access token
gcloud iam service-accounts create quickstart
gcloud iam service-accounts keys create key.json --iam-account quickstart@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com
gcloud auth activate-service-account --key-file key.json
export ACCESS_TOKEN=$(gcloud auth print-access-token)
cat > request.json <<EOF
{
"inputUri":"gs://spls/gsp154/video/train.mp4",
"features": [
"TEXT_DETECTION"
]
}
EOF
```
```
curl -s -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  'https://videointelligence.googleapis.com/v1/videos:annotate' \
  -d @request.json

# Replace OPERATION_FROM_PREVIOUS_REQUEST with the "name" value returned by the annotate call above
curl -s -H 'Content-Type: application/json' \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  'https://videointelligence.googleapis.com/v1/operations/OPERATION_FROM_PREVIOUS_REQUEST' > result1.json
```