      How To Build a GraphQL API With Golang to Upload Files to DigitalOcean Spaces


      The author selected the Diversity in Tech Fund to receive a donation as part of the Write for DOnations program.

      Introduction

      For many applications, one desirable feature is the user’s ability to upload a profile image. However, building this feature can be a challenge for developers new to GraphQL, which has no built-in support for file uploads.

In this tutorial, you will learn to upload images to a third-party storage service directly from your backend application. You will build a GraphQL API that uses the S3-compatible AWS SDK for Go to upload images to DigitalOcean Spaces, a highly scalable object storage service. The Go backend application will expose a GraphQL API and store user data in a PostgreSQL database provided by DigitalOcean’s Managed Databases service.

      By the end of this tutorial, you will have built a GraphQL API using Golang that can receive a media file from a multipart HTTP request and upload the file to a bucket within DigitalOcean Spaces.

      Prerequisites

      To follow this tutorial, you will need:

      • A DigitalOcean account. If you do not have one, sign up for a new account. You will use DigitalOcean’s Spaces and Managed Databases in this tutorial.

      • A DigitalOcean Space with Access Key and Access Secret, which you can create by following the tutorial, How To Create A DigitalOcean Space and API Key. You can also see product documentation for How to Manage Administrative Access to Spaces.

      • Go installed on your local machine, which you can do by following our series, How to Install and Set Up a Local Programming Environment for Go. This tutorial used Go version 1.17.1.

      • Basic knowledge of Golang, which you can gain from our How To Code in Go series. The tutorial, How To Write Your First Program In Go, provides a good introduction to the Golang programming language.

      • An understanding of GraphQL, which you can find in our tutorial, An Introduction To GraphQL.

      Step 1 — Bootstrapping a Golang GraphQL API

In this step, you will use the Gqlgen library to bootstrap the GraphQL API. Gqlgen is a Go library for building GraphQL APIs. Two important features that Gqlgen provides are a schema-first approach and code generation. With a schema-first approach, you first define the data model for the API using the GraphQL Schema Definition Language (SDL). Then you generate the boilerplate code for the API from the defined schema. Using the code generation feature, you do not need to manually create the query and mutation resolvers for the API, as they are automatically generated.

      To get started, execute the command below to install gqlgen:

      • go install github.com/99designs/gqlgen@latest

      Next, create a project directory named digitalocean to store the files for this project:
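
• mkdir digitalocean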

      Change into the digitalocean project directory:
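
• cd digitalocean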

      From your project directory, run the following command to create a go.mod file that manages the modules within the digitalocean project:
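
• go mod init digitalocean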

      Next, using nano or your favorite text editor, create a file named tools.go within the project directory:
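
• nano tools.go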

Add the following lines to the tools.go file to register gqlgen as a tool for the project:

// +build tools

package tools

import _ "github.com/99designs/gqlgen"
      

      Next, execute the tidy command to install the gqlgen dependency introduced within the tools.go file:
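
• go mod tidy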

      Finally, using the installed Gqlgen library, generate the boilerplate files needed for the GraphQL API:
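
• gqlgen init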

      Running the gqlgen command above generates a server.go file for running the GraphQL server and a graph directory containing a schema.graphqls file that contains the Schema Definitions for the GraphQL API.

      In this step, you used the Gqlgen library to bootstrap the GraphQL API. Next, you’ll define the schema of the GraphQL application.

      Step 2 — Defining the GraphQL Application Schema

In this step, you will define the schema of the GraphQL application by modifying the schema.graphqls file that was automatically generated when you ran the gqlgen init command. In this file, you will define the User, Query, and Mutation types.

      Navigate to the graph directory and open the schema.graphqls file, which defines the schema of the GraphQL application. Replace the boilerplate schema with the following code block, which defines the User type with a Query to retrieve all user data and a Mutation to insert data:

      schema.graphqls

      
      scalar Upload
      
      type User {
        id: ID!
        fullName: String!
        email: String!
        img_uri: String!
        DateCreated: String!
      }
      
      type Query {
        users: [User]!
      }
      
      input NewUser {
        fullName: String!
        email: String!
        img_uri: String
        DateCreated: String
      }
      
      input ProfileImage {
        userId: String
        file: Upload
      }
      
      type Mutation {
        createUser(input: NewUser!): User!
        uploadProfileImage(input: ProfileImage!): Boolean!
      }
      

The code block defines a single Query type for retrieving all users and a Mutation type containing two operations. A mutation is used to insert or mutate existing data in a GraphQL application, while a query is used to fetch data, similar to the GET HTTP verb in a REST API.

The schema in the code block above uses the GraphQL Schema Definition Language to define a Mutation containing the createUser field, which accepts the NewUser input as a parameter and returns a single user. It also contains the uploadProfileImage field, which accepts the ProfileImage input and returns a boolean value indicating whether the upload succeeded.

Note: Gqlgen automatically defines the Upload scalar type, which describes the properties of an uploaded file. To use it, you only need to declare it at the top of the schema file, as was done in the code block above.

      At this point, you have defined the structure of the data model for the application. The next step is to generate the schema’s query and the mutation resolver functions using Gqlgen’s code generation feature.

      Step 3 — Generating the Application Resolvers

      In this step, you will use Gqlgen’s code generation feature to automatically generate the GraphQL resolvers based on the schema that you created in the previous step. A resolver is a function that resolves or returns a value for a GraphQL field. This value could be an object or a scalar type such as a string, number, or even a boolean.

      The Gqlgen package is based on a schema-first approach. A time-saving feature of Gqlgen is its ability to generate your application’s resolvers based on your defined schema in the schema.graphqls file. With this feature, you do not need to manually write the resolver boilerplate code, which means you can focus on implementing the defined resolvers.

      To use the code generation feature, execute the command below in the project directory to generate the GraphQL API model files and resolvers:
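
• gqlgen generate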

      A few things will happen after executing the gqlgen command. Two validation errors relating to the schema.resolvers.go file will be printed out, some new files will be generated, and your project will have a new folder structure.

      Execute the tree command to view the new files added to your project.

      tree *
      

      The current directory structure will look similar to this:

      Output

go.mod
go.sum
gqlgen.yml
graph
├── db.go
├── generated
│   └── generated.go
├── model
│   └── models_gen.go
├── resolver.go
├── schema.graphqls
└── schema.resolvers.go
server.go
tmp
├── build-errors.log
└── main
tools.go

2 directories, 8 files

      Among the project files, one important file is schema.resolvers.go. It contains methods that implement the Mutation and Query types previously defined in the schema.graphqls file.

      To fix the validation errors, delete the CreateTodo and Todos methods at the bottom of the schema.resolvers.go file. Gqlgen moved the methods to the bottom of the file because the type definitions were changed in the schema.graphqls file.

      schema.resolvers.go

      
      package graph
      
      // This file will be automatically regenerated based on the schema, any resolver implementations
      // will be copied through when generating and any unknown code will be moved to the end.
      
      import (
          "context"
          "digitalocean/graph/generated"
          "digitalocean/graph/model"
          "fmt"
      )
      
      func (r *mutationResolver) CreateUser(ctx context.Context, input model.NewUser) (*model.User, error) {
          panic(fmt.Errorf("not implemented"))
      }
      
      func (r *mutationResolver) UploadProfileImage(ctx context.Context, input model.ProfileImage) (bool, error) {
          panic(fmt.Errorf("not implemented"))
      }
      
func (r *queryResolver) Users(ctx context.Context) ([]*model.User, error) {
    panic(fmt.Errorf("not implemented"))
}
      
      // Mutation returns generated.MutationResolver implementation.
      func (r *Resolver) Mutation() generated.MutationResolver { return &mutationResolver{r} }
      
      // Query returns generated.QueryResolver implementation.
      func (r *Resolver) Query() generated.QueryResolver { return &queryResolver{r} }
      
      type mutationResolver struct{ *Resolver }
      type queryResolver struct{ *Resolver }
      
      // !!! WARNING !!!
      // The code below was going to be deleted when updating resolvers. It has been copied here so you have
      // one last chance to move it out of harms way if you want. There are two reasons this happens:
      //  - When renaming or deleting a resolver the old code will be put in here. You can safely delete
      //    it when you're done.
      //  - You have helper methods in this file. Move them out to keep these resolver files clean.
      
func (r *mutationResolver) CreateTodo(ctx context.Context, input model.NewTodo) (*model.Todo, error) {
    panic(fmt.Errorf("not implemented"))
}
func (r *queryResolver) Todos(ctx context.Context) ([]*model.Todo, error) {
    panic(fmt.Errorf("not implemented"))
}
      

      As defined in the schema.graphqls file, Gqlgen’s code generator created two mutations and one query resolver method. These resolvers serve the following purposes:

      • CreateUser: This mutation resolver inserts a new user record into the connected Postgres database.

• UploadProfileImage: This mutation resolver receives a media file from a multipart HTTP request and uploads the file to a bucket within DigitalOcean Spaces. After the file upload, the URL of the uploaded file is inserted into the img_uri field of the previously created user.

      • Users: This query resolver queries the database for all existing users and returns them as the query result.

Going through the methods generated from the Mutation and Query types, you will notice that they cause a panic with a not implemented error when executed, indicating that they are still auto-generated boilerplate code. Later in this tutorial, you will return to the schema.resolvers.go file to implement these generated methods.

      At this point, you generated the resolvers for this application based on the content of the schema.graphqls file. You will now use the Managed Databases service to create a database that will store the data passed to the mutation resolvers to create a user.

      Step 4 — Provisioning and Using a Managed Database Instance on DigitalOcean

      In this step, you will use the DigitalOcean console to access the Managed Databases service and create a PostgreSQL database to store data from this application. After the database has been created, you will securely store the details in a .env file.

Although the application will not store images directly in a database, it still needs a database to insert each user’s record. The stored record will then contain links to the uploaded files.

A user’s record will consist of fullName, email, DateCreated, and img_uri fields of the String data type. The img_uri field contains the URL pointing to an image file uploaded by a user through this GraphQL API and stored within a bucket on DigitalOcean Spaces.

      Using your DigitalOcean dashboard, navigate to the Databases section of the console to create a new database cluster, and select PostgreSQL from the list of databases offered. Leave all other settings at their default values and create this cluster using the button at the bottom.

DigitalOcean database cluster

      The database cluster creation process will take a few minutes before it is completed.

      After creating the cluster, follow the Getting Started steps on the database cluster page to set up the cluster for use.

      At the second step of the Getting Started guide, click the Continue, I’ll do this later text to proceed. By default, the database cluster is open to all connections.

      Note: In a production-ready scenario, the Add Trusted Sources input field at the second step should only contain trusted IP addresses, such as the IP Address of the DigitalOcean Droplet running the application. During development, you can alternatively add the IP address of your development machine to the Add Trusted Sources input field.

      Click the Allow these inbound sources button to save and proceed to the next step.

      At the next step, the connection details of the cluster are displayed. You can also find the cluster credentials by clicking the Actions dropdown, then selecting the Connection details option.

DigitalOcean database cluster credentials

      In this screenshot, the gray box at right shows the connection credentials of the created demo cluster.

      You will securely store these cluster credentials as environment variables. In the digitalocean project directory, create a .env file and add your cluster credentials in the following format, making sure to replace the highlighted placeholder content with your own credentials:

      .env

      
DB_PASSWORD=YOUR_DB_PASSWORD
DB_PORT=PORT
DB_NAME=YOUR_DATABASE_NAME
DB_ADDR=HOST
DB_USER=USERNAME
      

      With the connection details securely stored in the .env file, the next step will be to retrieve these credentials and connect the database cluster to your project.

Before proceeding, you will need a few libraries for working with the Postgres database. go-pg is a Golang library for translating ORM (object-relational mapping) queries into SQL queries for a Postgres database. godotenv is a Golang library for loading environment credentials from a .env file into your application. Lastly, go.uuid generates a UUID (universally unique identifier) for each user’s record that will be inserted into the database.

      Execute this command to install these:

      • go get github.com/go-pg/pg/v10 github.com/joho/godotenv github.com/satori/go.uuid

      Next, navigate to the graph directory and create a db.go file. You will gradually put together the code within the file to connect with the Postgres database created in the Managed Databases cluster.

      First, add the content of the code block into the db.go file. This function (createSchema) creates a user table in the Postgres database immediately after a connection to the database has been established.

      db.go

package graph

import (
    "digitalocean/graph/model"

    "github.com/go-pg/pg/v10"
    "github.com/go-pg/pg/v10/orm"
)

func createSchema(db *pg.DB) error {
    for _, models := range []interface{}{(*model.User)(nil)} {
        if err := db.Model(models).CreateTable(&orm.CreateTableOptions{
            IfNotExists: true,
        }); err != nil {
            return err
        }
    }

    return nil
}
      

Using the IfNotExists option passed to the CreateTable method from go-pg, the createSchema function only creates a new table in the database if the table does not already exist. You can understand this process as a simplified form of seeding a newly created database. Rather than creating the tables manually through the psql client or a GUI, the createSchema function takes care of the table creation.

      Next, add the content of the code block below into the db.go file to establish a connection to the Postgres database and execute the createSchema function above when a connection has been established successfully:

      db.go

      
import (
      // ...

    "fmt"
    "os"
)

func Connect() *pg.DB {
    DB_PASSWORD := os.Getenv("DB_PASSWORD")
    DB_PORT := os.Getenv("DB_PORT")
    DB_NAME := os.Getenv("DB_NAME")
    DB_ADDR := os.Getenv("DB_ADDR")
    DB_USER := os.Getenv("DB_USER")

    connStr := fmt.Sprintf(
        "postgresql://%v:%v@%v:%v/%v?sslmode=require",
        DB_USER, DB_PASSWORD, DB_ADDR, DB_PORT, DB_NAME,
    )

    opt, err := pg.ParseURL(connStr)
    if err != nil {
        panic(err)
    }

    db := pg.Connect(opt)

    if schemaErr := createSchema(db); schemaErr != nil {
        panic(schemaErr)
    }

    if _, DBStatus := db.Exec("SELECT 1"); DBStatus != nil {
        panic("PostgreSQL is down")
    }

    return db
}
      

      When executed, the exported Connect function in the code block above establishes a connection to a Postgres database using go-pg. This is done through the following operations:

      • First, the database credentials you stored in the root .env file are retrieved. Then, a variable is created to store a string formatted with the retrieved credentials. This variable will be used as a connection URI when connecting with the database.

• Next, the created connection string is parsed to check that the formatted credentials are valid. If valid, the parsed options are passed to the pg.Connect function to establish a connection.

      To use the exported Connect function, you will need to add the function to the server.go file, so it will be executed when the application is started. Then the connection can be stored in the DB field within the Resolver struct.

      To use the previously created Connect function from the graph package immediately after the application is started, and to load the credentials from the .env file into the application, open the server.go file in your preferred code editor and add the lines highlighted below:

      Note: Make sure to replace the existing srv variable in the server.go file with the srv variable highlighted below.

      server.go

package main

import (
    "log"
    "net/http"
    "os"

    "digitalocean/graph"
    "digitalocean/graph/generated"

    "github.com/99designs/gqlgen/graphql/handler"
    "github.com/99designs/gqlgen/graphql/playground"
    "github.com/joho/godotenv"
)

const defaultPort = "8080"

func main() {
    err := godotenv.Load()
    if err != nil {
        log.Fatal("Error loading .env file")
    }

    // ...

    Database := graph.Connect()
    srv := handler.NewDefaultServer(
        generated.NewExecutableSchema(
            generated.Config{
                Resolvers: &graph.Resolver{
                    DB: Database,
                },
            }),
    )

    // ...
}
      

In this code snippet, you loaded the credentials stored in the .env file through the Load() function. You called the Connect function from the graph package and created the Resolver object with the database connection stored in the DB field. (The stored database connection will be accessed by the resolvers later in this tutorial.)

      Currently, the boilerplate Resolver struct in the resolver.go file does not contain the DB field where you stored the database connection in the code above. You will need to create the DB field.

      In the graph directory, open the resolver.go file and modify the Resolver struct to have a DB field with a go-pg pointer as its type, as shown below:

      resolver.go

      package graph
      
      import "github.com/go-pg/pg/v10"
      
      // This file will not be regenerated automatically.
      //
      // It serves as dependency injection for your app, add any dependencies you require here.
      
      type Resolver struct {
          DB *pg.DB
      }
      

Now a database connection will be established each time the server.go entry file is run, and the go-pg package can be used as an ORM to perform operations on the database from the resolver functions.

      In this step, you created a PostgreSQL database using the Managed Database service on DigitalOcean. You also created a db.go file with a Connect function to establish a connection to the PostgreSQL database when the application is started. Next, you will implement the generated resolvers to store data in the PostgreSQL database.

      Step 5 — Implementing the Generated Resolvers

      In this step, you will implement the methods in the schema.resolvers.go file, which serves as the mutation and query resolvers. The implemented mutation resolvers will create a user and upload the user’s profile image, while the query resolver will retrieve all stored user details.

      Implementing the Mutation Resolver Methods

Two mutation resolvers were generated from the schema.graphqls file: one inserts the user’s record, while the other handles profile image uploads. However, these mutations have not yet been implemented, as they still contain boilerplate code.

      Open the schema.resolvers.go file. Modify the imports and the CreateUser mutation with the highlighted lines to insert a new row containing the user details input into the database:

      schema.resolvers.go

package graph

import (
    "context"
    "fmt"
    "time"

    "digitalocean/graph/generated"
    "digitalocean/graph/model"

    "github.com/satori/go.uuid"
)

func (r *mutationResolver) CreateUser(ctx context.Context, input model.NewUser) (*model.User, error) {
    user := model.User{
        ID:          fmt.Sprintf("%v", uuid.NewV4()),
        FullName:    input.FullName,
        Email:       input.Email,
        ImgURI:      "https://bit.ly/3mCSn2i",
        DateCreated: time.Now().Format("01-02-2006"),
    }

    _, err := r.DB.Model(&user).Insert()
    if err != nil {
        return nil, fmt.Errorf("error inserting user: %v", err)
    }

    return &user, nil
}
      
      

      In the CreateUser mutation, there are two things to note about the user rows inserted. First, each row that is inserted is given a UUID. Second, the ImgURI field in each row has a placeholder image URL as the default value. This will be the default value for all records and will be updated when a user uploads a new image.

      Next, you will test the application that has been built at this point. From the project directory, run the server.go file with the following command:
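
• go run server.go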

Now, navigate to http://localhost:8080 in your web browser to access the GraphQL playground built into your GraphQL API. Paste the GraphQL mutation in the code block below into the playground editor to insert a new user record.

      graphql

      
      mutation createUser {
        createUser(
          input: {
            email: "johndoe@gmail.com"
            fullName: "John Doe"
          }
        ) {
          id
        }
      }
      

      The output in the right pane will look similar to this:

A create user mutation on the GraphQL Playground

      You executed the CreateUser mutation to create a test user with the name of John Doe, and the id of the newly inserted user record was returned as a result of the mutation.

      Note: Copy the id value returned from the executed GraphQL query. You will use the id when uploading a profile image for the test user created above.

At this point, you have the second UploadProfileImage mutation resolver function left to implement. Before you implement this function, you will implement the query resolver. Each upload is linked to a specific user, which is why you copied the ID of a specific user before uploading an image.

      Implementing the Query Resolver Method

As defined in the schema.graphqls file, one query resolver was generated to retrieve all created users. Similar to the previous mutation resolver methods, you also need to implement the query resolver method.

Open schema.resolvers.go and modify the generated Users query resolver with the highlighted lines. The new code within the Users method below will query the Postgres database for all user rows and return the result.

      schema.resolvers.go

package graph

func (r *queryResolver) Users(ctx context.Context) ([]*model.User, error) {
    var users []*model.User

    err := r.DB.Model(&users).Select()
    if err != nil {
        return nil, err
    }

    return users, nil
}
      

Within the Users resolver function above, fetching all records within the user table is made possible by using go-pg’s Select method on the User model without adding a WHERE or LIMIT clause to the query.

      Note: For a bigger application where many records will be returned from the query, it is important to consider paginating the data returned for improved performance.
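
As an illustration, go-pg’s query builder exposes Limit and Offset methods, so a paginated variant of this resolver could look like the following sketch. The page and pageSize values here are hypothetical and are not part of this tutorial’s schema:

func (r *queryResolver) Users(ctx context.Context) ([]*model.User, error) {
    var users []*model.User

    // Hypothetical pagination values; a real API would accept these
    // as arguments on the users query instead of hardcoding them.
    page, pageSize := 0, 20

    err := r.DB.Model(&users).
        Limit(pageSize).
        Offset(page * pageSize).
        Select()
    if err != nil {
        return nil, err
    }

    return users, nil
}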

      To test this query resolver from your browser, navigate to http://localhost:8080 to access the GraphQL playground. Paste the GraphQL Query below into the playground editor to fetch all created user records.

      graphql

      
      query fetchUsers {
        users {
            fullName
            id
            img_uri
        }
      }
      

      The output in the right pane will look similar to this:

      Query result GraphQL playground

In the returned results, you can see that a users object with an array value was returned. For now, only the previously created user is in the users array because it is the only record in the table. More users will appear in the users array if you execute the createUser mutation with new details. You can also observe that the img_uri field in the returned data has the hardcoded fallback image URL.

At this point, you have implemented both the CreateUser mutation and the Users query. Everything is in place for you to receive images from the second UploadProfileImage resolver and upload the received image to a bucket within DigitalOcean Spaces using the S3-compatible AWS SDK for Go.

      Step 6 — Uploading Images to DigitalOcean Spaces

In this step, you will use the AWS SDK for Go within the second UploadProfileImage mutation to upload an image to your Space.

      To begin, navigate to the Spaces section of your DigitalOcean console, where you will create a new bucket for storing the uploaded files from your backend application.

      Click the Create New Space button. Leave the settings at their default values and specify a unique name for the new Space:

DigitalOcean Spaces

After a new Space has been created, navigate to the Settings tab and copy the Space’s endpoint, name, and region. Add these to the .env file within the GraphQL project in this format:

      .env

      SPACE_ENDPOINT=BUCKET_ENDPOINT
      DO_SPACE_REGION=DO_SPACE_REGION
      DO_SPACE_NAME=DO_SPACE_NAME
      

As an example, the following screenshot shows the Settings tab, highlighting the name, region, and endpoint details of the demo Space (Victory-space):

      Victory-space endpoint, name, and region

As part of the prerequisites, you created an Access key and Secret key for your Space. Paste your Access and Secret keys into the .env file within the GraphQL application in the following format:

      .env

      ACCESS_KEY=YOUR_SPACE_ACCESS_KEY
      SECRET_KEY=YOUR_SPACE_SECRET_KEY
      

At this point, use the CTRL + C key combination to stop the GraphQL server, then execute the command below to restart the GraphQL application with the new credentials loaded:
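
• go run server.go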

      Now that your Space credentials are loaded into the application, you will create the upload logic in the UploadProfileImage mutation resolver. The first step will be to add and configure the aws-sdk-go SDK to connect to your DigitalOcean Space.

One way to programmatically perform operations on your bucket within Spaces is through the use of compatible AWS SDKs. The AWS SDK for Go provides a set of libraries that Go applications can use to perform operations on AWS resources, such as file transfers to S3 buckets.

The DigitalOcean Spaces documentation provides a list of operations you can perform on the Spaces API using an AWS SDK. You will use the aws-sdk-go SDK to connect to your DigitalOcean Space.

      Execute the go get command to install the aws-sdk-go SDK into the application:

      • go get github.com/aws/aws-sdk-go

      Over the next few code blocks, you will gradually put together the upload logic in the UploadProfileImage mutation resolver.

      First, open the schema.resolvers.go file. Add the highlighted lines to configure the AWS SDK with the stored credentials and establish a connection with your DigitalOcean Space:

      Note: The code within the code block below is incomplete, as you are gradually putting the upload logic together. You will complete the code in the subsequent code blocks.

      schema.resolvers.go

package graph

import (
    ...

    "os"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func (r *mutationResolver) UploadProfileImage(ctx context.Context, input model.ProfileImage) (bool, error) {
    SpaceRegion := os.Getenv("DO_SPACE_REGION")
    accessKey := os.Getenv("ACCESS_KEY")
    secretKey := os.Getenv("SECRET_KEY")

    s3Config := &aws.Config{
        Credentials: credentials.NewStaticCredentials(accessKey, secretKey, ""),
        Endpoint:    aws.String(os.Getenv("SPACE_ENDPOINT")),
        Region:      aws.String(SpaceRegion),
    }

    newSession := session.New(s3Config)
    s3Client := s3.New(newSession)

}
      

      Now that the SDK is configured, the next step is to upload the file sent in the multipart HTTP request.

One way to handle the file sent is to read its content from the multipart request, temporarily save the content to a new file, upload the temporary file using the aws-sdk-go library, and then delete the temporary file after the upload. Using this approach, a client application such as a web application consuming this GraphQL API still uses the same GraphQL endpoint to perform file uploads, rather than using a third-party API to upload files.

      To achieve this, add the highlighted lines to the existing code within the UploadProfileImage mutation resolver in the schema.resolvers.go file:

      schema.resolvers.go

      
package graph

import (
    ...

    "bytes"
    "io/ioutil"
)

func (r *mutationResolver) UploadProfileImage(ctx context.Context, input model.ProfileImage) (bool, error) {
...

    SpaceName := os.Getenv("DO_SPACE_NAME")

...

    // Dereference the optional UserID pointer to build a unique filename.
    userFileName := fmt.Sprintf("%v-%v", *input.UserID, input.File.Filename)

    stream, readErr := ioutil.ReadAll(input.File.File)
    if readErr != nil {
        fmt.Printf("error from file %v", readErr)
    }

    fileErr := ioutil.WriteFile(userFileName, stream, 0644)
    if fileErr != nil {
        fmt.Printf("file err %v", fileErr)
    }

    file, openErr := os.Open(userFileName)
    if openErr != nil {
        fmt.Printf("Error opening file: %v", openErr)
    }

    defer file.Close()

    buffer := make([]byte, input.File.Size)

    _, _ = file.Read(buffer)

    fileBytes := bytes.NewReader(buffer)

    object := s3.PutObjectInput{
        Bucket: aws.String(SpaceName),
        Key:    aws.String(userFileName),
        Body:   fileBytes,
        ACL:    aws.String("public-read"),
    }

    if _, uploadErr := s3Client.PutObject(&object); uploadErr != nil {
        return false, fmt.Errorf("error uploading file: %v", uploadErr)
    }

    _ = os.Remove(userFileName)

    return true, nil
}
      

Using the ReadAll method from the ioutil package in the code block above, you first read the content of the file added to the multipart request sent to the GraphQL API, and then a temporary file is created to dump this content into.

Next, using the PutObjectInput struct, you created the structure of the file to be uploaded by specifying the Bucket, Key, ACL, and Body fields, with Body set to the content of the temporarily stored file.

      Note: The Access Control List (ACL) field in the PutObjectInput struct has a public-read value to make all uploaded files available for viewing over the internet. You can remove this field if your application requires that uploaded data be kept private.

      After creating the PutObjectInput struct, the PutObject method is used to make a PUT operation, sending the values of the PutObjectInput struct to the bucket. If there is an error, a false boolean value and an error message are returned, ending the execution of the resolver function and the mutation in general.

      To test the upload mutation resolver, you can use an image of Sammy the Shark, DigitalOcean’s mascot. Use the wget command to download an image of Sammy:

      • wget https://html.sammy-codes.com/images/small-profile.jpeg

      Next, execute the cURL command below to make an HTTP request to the GraphQL API to upload Sammy’s image, which has been added to the request form body.

Note: If you are on a Windows operating system, it is recommended that you execute the cURL commands using the Git Bash shell, because of how the command’s quotes are handled.

• curl localhost:8080/query -F operations='{ "query": "mutation uploadProfileImage($image: Upload! $userId : String!) { uploadProfileImage(input: { file: $image userId : $userId}) }", "variables": { "image": null, "userId" : "12345" } }' -F map='{ "0": ["variables.image"] }' -F 0=@small-profile.jpeg

      Note: We are using a random userId value in the request above because the process of updating a user’s record has not yet been implemented.

      The output will look similar to this, indicating that the file upload was successful:

      Output

      {"data": { "uploadProfileImage": true }}

      In the Spaces section of the DigitalOcean console, you will find the image uploaded from your terminal:

A bucket within DigitalOcean showing a list of uploaded files

At this point, file uploads within the application are working; however, the files are not yet linked to the user who performed the upload. The goal of each file upload is to have the file uploaded into a storage bucket and then linked back to a user by updating the img_uri field of the user.

Open the resolver.go file in the graph directory and add the code block below. It contains two methods: one to retrieve a user from the database by a specified field, and the other to update the record of a user.

      resolver.go

      
      import (
      ...
      
        "digitalocean/graph/model"
        "fmt"
      )
      
      ...
      
      func (r *mutationResolver) GetUserByField(field, value string) (*model.User, error) {
          user := model.User{}
      
          err := r.DB.Model(&user).Where(fmt.Sprintf("%v = ?", field), value).First()
      
          return &user, err
      }
      
      
      func (r *mutationResolver) UpdateUser(user *model.User) (*model.User, error) {
          _, err := r.DB.Model(user).Where("id = ?", user.ID).Update()
          return user, err
      }
      
      

The first function, GetUserByField, accepts field and value arguments, both of type string. Using go-pg’s ORM, it executes a query on the database, fetching data from the user table with a WHERE clause.

      The second UpdateUser function in the code block uses go-pg to execute an UPDATE statement to update a record in the user table. Using the where method, a WHERE clause with a condition is added to the UPDATE statement to update only the row having the same ID passed into the function.

      Now you can use the two methods in the UploadProfileImage mutation. Add the content of the highlighted code block below to the UploadProfileImage mutation within the schema.resolvers.go file. This will retrieve a specific row from the user table and update the img_uri field in the user’s record after the file has been uploaded.

      Note: Place the highlighted code at the line above the existing return statement within the UploadProfileImage mutation.

      schema.resolvers.go

      
package graph

func (r *mutationResolver) UploadProfileImage(ctx context.Context, input model.ProfileImage) (bool, error) {
    _ = os.Remove(userFileName)

    user, userErr := r.GetUserByField("ID", *input.UserID)
    if userErr != nil {
        return false, fmt.Errorf("error getting user: %v", userErr)
    }

    fileUrl := fmt.Sprintf("https://%v.%v.digitaloceanspaces.com/%v", SpaceName, SpaceRegion, userFileName)

    user.ImgURI = fileUrl

    if _, err := r.UpdateUser(user); err != nil {
        return false, fmt.Errorf("err updating user: %v", err)
    }

    return true, nil
}
      

      From the new code added to the schema.resolvers.go file, an ID string and the user’s ID are passed to the GetUserByField helper function to retrieve the record of the user executing the mutation.

A new variable is then created and given the value of a string formatted to contain the link of the recently uploaded file, in the format https://BUCKET_NAME.SPACE_REGION.digitaloceanspaces.com/USER_ID-FILE_NAME. The ImgURI field in the retrieved user model is then reassigned the value of the formatted string as a link to the uploaded file.

      Paste the curl command below into your terminal, and replace the highlighted USER_ID placeholder in the command with the userId of the user created through the GraphQL playground in a previous step. Make sure the userId is wrapped in quotation marks so that the terminal can encode the value properly.

• curl localhost:8080/query -F operations='{ "query": "mutation uploadProfileImage($image: Upload! $userId : String!) { uploadProfileImage(input: { file: $image userId : $userId}) }", "variables": { "image": null, "userId" : "USER_ID" } }' -F map='{ "0": ["variables.image"] }' -F 0=@small-profile.jpeg

      The output will look similar to this:

      Output

      {"data": { "uploadProfileImage": true }}

      To further confirm that the user’s img_uri was updated, you can use the fetchUsers query from the GraphQL playground in the browser to retrieve the user’s details. If the update was successful, you will see that the default placeholder URL of https://bit.ly/3mCSn2i in the img_uri field has been updated to the value of the uploaded image.

      The output in the right pane will look similar to this:

      A query mutation to retrieve an updated user record using the GraphQL Playground

      In the returned results, the img_uri in the first user object returned from the query has a value that corresponds to a file upload to a bucket within DigitalOcean Spaces. The link in the img_uri field is made up of the bucket endpoint, the user’s ID, and lastly, the filename.

To test the permissions of the uploaded file set through the ACL option, you can open the img_uri link in your browser. Due to the default metadata on the uploaded image, it will automatically download to your computer as an image file. You can open the file to view the image.

      Downloaded view of the uploaded file

      The image at the img_uri link will be the same image that was uploaded from the command line, indicating that the methods in the resolver.go file were executed correctly, and the entire file upload logic in the UploadProfileImage mutation works as expected.
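
As a side note, if you would prefer browsers to display the image inline rather than download it, the PutObjectInput struct also accepts a ContentType field. Here is a minimal sketch, assuming every upload is a JPEG image; a real application would detect the MIME type from the uploaded file:

object := s3.PutObjectInput{
    Bucket: aws.String(SpaceName),
    Key:    aws.String(userFileName),
    Body:   fileBytes,
    ACL:    aws.String("public-read"),
    // Assumed JPEG content; detect the actual MIME type in production.
    ContentType: aws.String("image/jpeg"),
}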

      In this step, you uploaded an image into a DigitalOcean Space by using the AWS SDK for Go from the UploadProfileImage mutation resolver.

      Conclusion

In this tutorial, you performed a file upload to a bucket on DigitalOcean Spaces, using the AWS SDK for Go from a mutation resolver in a GraphQL application.

      As a next step, you could deploy the application built within this tutorial. The Go Dev Guide provides a beginner-friendly guide on how to deploy a Golang application to DigitalOcean’s App Platform, which is a fully managed solution for building, deploying, and managing your applications from various programming languages.




      How To Work With Zip Files in Node.js


      The author selected Open Sourcing Mental Illness to receive a donation as part of the Write for DOnations program.

      Introduction

Working with files is one of the common tasks among developers. As your files grow in size, they start taking significant space on your hard drive. Sooner or later you may need to transfer the files to other servers or upload multiple files from your local machine to different platforms. Some of these platforms have file size limits and won’t accept large files. To get around this, you can group the files into a single ZIP file. A ZIP file is an archive format that packs and compresses files with a lossless compression algorithm, which can reconstruct the data without any loss. In Node.js, you can use the adm-zip module to create and read ZIP archives.

In this tutorial, you will use the adm-zip module to compress, read, and decompress files. First, you’ll combine multiple files into a ZIP archive using adm-zip. You’ll then list the ZIP archive contents. After that, you’ll add a file to an existing ZIP archive, and then finally, you’ll extract a ZIP archive into a directory.

      Prerequisites

      To follow this tutorial, you’ll need:
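
• A Node.js development environment set up on your machine, including npm. This tutorial uses the node and npm commands throughout.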

      Step 1 — Setting Up the Project

      In this step, you’ll create the directory for your project and install adm-zip as a dependency. This directory is where you’ll keep your program files. You’ll also create another directory containing text files and an image. You’ll archive this directory in the next section.

      Create a directory called zip_app with the following command:
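
• mkdir zip_app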

      Navigate into the newly created directory with the cd command:
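
• cd zip_app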

      Inside the directory, create a package.json file to manage the project dependencies:
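
• npm init -y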

      The -y option creates a default package.json file.

      Next, install adm-zip with the npm install command:
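
• npm install adm-zip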

      After you run the command, npm will install adm-zip and update the package.json file.

      Next, create a directory called test and move into it:
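
• mkdir test && cd test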

      In this directory, you will create three text files and download an image. The three files will be filled with dummy content to make their file sizes larger. This will help to demonstrate ZIP compression when you archive this directory.

      Create the file1.txt and fill it with dummy content using the following command:

      • yes "dummy content" | head -n 100000 > file1.txt

      The yes command logs the string dummy content repeatedly. Using the pipe command |, you send the output from the yes command to be used as input for the head command. The head command prints part of the given input into the standard output. The -n option specifies the number of lines that should be written to the standard output. Finally, you redirect the head output to a new file file1.txt using >.

      Create a second file with the string “dummy content” repeated 300,000 lines:

      • yes "dummy content" | head -n 300000 > file2.txt

      Create another file with the dummy content string repeated 600,000 lines:

      • yes "dummy content" | head -n 600000 > file3.txt

      Finally, download an image into the directory using curl:

      • curl -O https://assets.digitalocean.com/how-to-process-images-in-node-js-with-sharp/underwater.png

      Move back into the main project directory with the following command:
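
• cd ..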

      The .. will move you to the parent directory, which is zip_app.

      You’ve now created the project directory, installed adm-zip, and created a directory with files for archiving. In the next step, you’ll archive a directory using the adm-zip module.

      Step 2 — Creating a ZIP Archive

      In this step, you’ll use adm-zip to compress and archive the directory you created in the previous section.

      To archive the directory, you’ll import the adm-zip module and use the module’s addLocalFolder() method to add the directory to the adm-zip module’s ZIP object. Afterward, you’ll use the module’s writeZip() method to save the archive in your local system.

      Create and open a new file createArchive.js in your preferred text editor. This tutorial uses nano, a command-line text editor:
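
• nano createArchive.js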

      Next, require in the adm-zip module in your createArchive.js file:

      zip_app/createArchive.js

      const AdmZip = require("adm-zip");
      

      The adm-zip module provides a class that contains methods for creating ZIP archives.

      Since it’s common to encounter large files during the archiving process, you might end up blocking the main thread until the ZIP archive is saved. To write non-blocking code, you’ll define an asynchronous function to create and save a ZIP archive.

      In your createArchive.js file, add the following highlighted code:

      zip_app/createArchive.js

      
      const AdmZip = require("adm-zip");
      
      async function createZipArchive() {
        const zip = new AdmZip();
        const outputFile = "test.zip";
        zip.addLocalFolder("./test");
        zip.writeZip(outputFile);
        console.log(`Created ${outputFile} successfully`);
      }
      
      createZipArchive();
      

createZipArchive is an asynchronous function that creates a ZIP archive from a given directory. What makes it asynchronous is the async keyword placed before the function definition. Within the function, you create an instance of the adm-zip module, which provides methods you can use for reading and creating archives. When you create an instance, adm-zip creates an in-memory ZIP where you can add files or directories.

Next, you define the archive name and store it in the outputFile variable. To add the test directory to the in-memory archive, you invoke the addLocalFolder() method from adm-zip with the directory path as an argument.

      After the directory is added, you invoke the writeZip() method from adm-zip with a variable containing the name of the ZIP archive. The writeZip() method saves the archive to your local disk.

      Once that’s done, you invoke console.log() to log that the ZIP file has been created successfully.

      Finally, you call the createZipArchive() function.

      Before you run the file, wrap the code in a try…catch block to handle runtime errors:

      zip_app/createArchive.js

      const AdmZip = require("adm-zip");
      
      async function createZipArchive() {
        try {
          const zip = new AdmZip();
          const outputFile = "test.zip";
          zip.addLocalFolder("./test");
          zip.writeZip(outputFile);
          console.log(`Created ${outputFile} successfully`);
        } catch (e) {
          console.log(`Something went wrong. ${e}`);
        }
      }
      
      createZipArchive();
      

      Within the try block, the code will attempt to create a ZIP archive. If successful, the createZipArchive() function will exit, skipping the catch block. If creating a ZIP archive triggers an error, execution will skip to the catch block and log the error in the console.

      Save and exit the file in nano with CTRL+X. Enter y to save the changes, and confirm the file by pressing ENTER on Windows, or the RETURN key on the Mac.

      Run the createArchive.js file using the node command:
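
• node createArchive.js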

      You’ll receive the following output:

      Output

      Created test.zip successfully

      List the directory contents to see if the ZIP archive has been created:
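
• ls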

      You’ll receive the following output showing the archive among the contents:

      Output

      createArchive.js node_modules package-lock.json package.json test test.zip

      With the confirmation that the ZIP archive has been created, you’ll compare the ZIP archive, and the test directory file size to see if the compression works.

      Check the test directory size using the du command:
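
• du -h test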

      The -h flag instructs du to show the directory size in a human-readable format.

      After running the command, you will receive the following output:

      Output

      15M test

      Next, check the test.zip archive file size:
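
• du -h test.zip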

      The du command logs the following output:

      Output

      760K test.zip

As you can see, creating the ZIP file has dropped the directory size from 15 megabytes (15M) to 760 kilobytes (760K), which is a huge difference. The ZIP file is more portable and smaller in size.

      Now that you created a ZIP archive, you’re ready to list the contents in a ZIP file.

      Step 3 — Listing Files in a ZIP Archive

      In this step, you’ll read and list all files in a ZIP archive using adm-zip. To do that, you’ll instantiate the adm-zip module with your ZIP archive path. You’ll then call the module’s getEntries() method which returns an array of objects. Each object holds important information about an item in the ZIP archive. To list the files, you’ll iterate over the array and access the filename from the object, and log it in the console.

      Create and open readArchive.js in your favorite text editor:
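
• nano readArchive.js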

      In your readArchive.js, add the following code to read and list contents in a ZIP archive:

      zip_app/readArchive.js

      const AdmZip = require("adm-zip");
      
      async function readZipArchive(filepath) {
        try {
          const zip = new AdmZip(filepath);
      
          for (const zipEntry of zip.getEntries()) {
            console.log(zipEntry.toString());
          }
        } catch (e) {
          console.log(`Something went wrong. ${e}`);
        }
      }
      
      readZipArchive("./test.zip");
      

      First, you require in the adm-zip module.

      Next, you define the readZipArchive() function, which is an asynchronous function. Within the function, you create an instance of adm-zip with the path of the ZIP file you want to read. The file path is provided by the filepath parameter. adm-zip will read the file and parse it.

After reading the archive, you define a for...of statement that iterates over objects in the array that the getEntries() method from adm-zip returns when invoked. On each iteration, the object is assigned to the zipEntry variable. Inside the loop, you convert the object into a string that represents the object using the Node.js toString() method, then log it in the console using the console.log() method.

      Finally, you invoke the readZipArchive() function with the ZIP archive file path as an argument.

      Save and exit your file, then run the file with the following command:
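
• node readArchive.js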

You will get output that resembles the following (edited for brevity):

      Output

      { "entryName": "file1.txt", "name": "file1.txt", "comment": "", "isDirectory": false, "header": { ... }, "compressedData": "<27547 bytes buffer>", "data": "<null>" } ...

      The console will log four objects. The other objects have been edited out to keep the tutorial brief.

      Each file in the archive is represented with an object similar to the one in the preceding output. To get the filename for each file, you need to access the name property.

      In your readArchive.js file, add the following highlighted code to access each filename:

      zip_app/readArchive.js

      const AdmZip = require("adm-zip");
      
      async function readZipArchive(filepath) {
        try {
          const zip = new AdmZip(filepath);
      
          for (const zipEntry of zip.getEntries()) {
            console.log(zipEntry.name);
          }
        } catch (e) {
          console.log(`Something went wrong. ${e}`);
        }
      }
      
      readZipArchive("./test.zip");
      

      Save and exit your text editor. Now, run the file again with the node command:
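
• node readArchive.js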

      Running the file results in the following output:

      Output

file1.txt
file2.txt
file3.txt
underwater.png

      The output now logs the filename of each file in the ZIP archive.

      You can now read and list each file in a ZIP archive. In the next section, you’ll add a file to an existing ZIP archive.

      Step 4 — Adding a File to an Existing Archive

      In this step, you’ll create a file and add it to the ZIP archive you created earlier without extracting it. First, you’ll read the ZIP archive by creating an adm-zip instance. Second, you’ll invoke the module’s addFile() method to add the file in the ZIP. Finally, you’ll save the ZIP archive in the local system.

      Create another file file4.txt with dummy content repeated 600,000 lines:

      • yes "dummy content" | head -n 600000 > file4.txt

      Create and open updateArchive.js in your text editor:
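
• nano updateArchive.js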

      Require in the adm-zip module and the fs module that allows you to work with files in your updateArchive.js file:

zip_app/updateArchive.js

const AdmZip = require("adm-zip");
      const fs = require("fs").promises;
      

You require in the promise-based version of the fs module, which allows you to write asynchronous code. When you invoke an fs method, it will return a promise.

      Next in your updateArchive.js file, add the following highlighted code to add a new file to the ZIP archive:

      zip_app/updateArchive.js

      const AdmZip = require("adm-zip");
      const fs = require("fs").promises;
      
      async function updateZipArchive(filepath) {
        try {
          const zip = new AdmZip(filepath);
      
    const content = await fs.readFile("./file4.txt");
          zip.addFile("file4.txt", content);
          zip.writeZip(filepath);
          console.log(`Updated ${filepath} successfully`);
        } catch (e) {
          console.log(`Something went wrong. ${e}`);
        }
      }
      
      updateZipArchive("./test.zip");
      

updateZipArchive is an asynchronous function that reads a file in the filesystem and adds it to an existing ZIP archive. In the function, you create an instance of adm-zip with the ZIP archive file path provided by the filepath parameter. Next, you invoke the fs module’s readFile() method to read the file in the file system. The readFile() method returns a promise, which you resolve with the await keyword (await is valid only in asynchronous functions). Once resolved, the method returns a buffer object containing the file contents.

      Next, you invoke the addFile() method from adm-zip. The method takes two arguments. The first argument is the filename you want to add to the archive, and the second argument is the buffer object containing the contents of the file that the readFile() method reads.

      Afterwards, you invoke adm-zip module’s writeZip() method to save and write new changes in the ZIP archive. Once that’s done, you call the console.log() method to log a success message.

      Finally, you invoke the updateZipArchive() function with the Zip archive file path as an argument.
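
      As an aside, because the second argument to addFile() is an ordinary Buffer, you are not limited to content read from disk. A minimal sketch (the notes.txt entry name is made up for illustration) that archives in-memory content directly:

      zip_app/addFromMemory.js

      const AdmZip = require("adm-zip");
      
      const zip = new AdmZip("./test.zip");
      
      // Any Buffer works as the second argument, so in-memory content
      // can be archived without writing it to a file first
      zip.addFile("notes.txt", Buffer.from("created in memory"));
      zip.writeZip("./test.zip");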

      Save and exit your file. Run the updateArchive.js file with the following command:

      • node updateArchive.js

      You’ll see output like this:

      Output

      Updated ./test.zip successfully

      Now, confirm that the ZIP archive contains the new file. Run the readArchive.js file to list the contents of the ZIP archive with the following command:

      • node readArchive.js

      You’ll receive the following output:

      Output

      file1.txt
      file2.txt
      file3.txt
      file4.txt
      underwater.png
      

      This confirms that the file has been added to the ZIP.

      Now that you can add a file to an existing archive, you’ll extract the archive in the next section.

      Step 5 — Extracting Files from a ZIP Archive

      In this step, you’ll read and extract all contents in a ZIP archive into a directory. To extract a ZIP archive, you’ll instantiate adm-zip with the archive file path. After that, you’ll invoke the module’s extractAllTo() method with the name of the directory where you want your extracted ZIP contents to reside.

      Create and open extractArchive.js in your text editor:

      • nano extractArchive.js

      Require in the adm-zip module and the path module in your extractArchive.js file:

      zip_app/extractArchive.js

      const AdmZip = require("adm-zip");
      const path = require("path");
      

      The path module provides helpful methods for dealing with file paths.

      Still in your extractArchive.js file, add the following highlighted code to extract an archive:

      zip_app/extractArchive.js

      const AdmZip = require("adm-zip");
      const path = require("path");
      
      async function extractArchive(filepath) {
        try {
          const zip = new AdmZip(filepath);
          const outputDir = `${path.parse(filepath).name}_extracted`;
          zip.extractAllTo(outputDir);
      
          console.log(`Extracted to "${outputDir}" successfully`);
        } catch (e) {
          console.log(`Something went wrong. ${e}`);
        }
      }
      
      extractArchive("./test.zip");
      

      extractArchive() is an asynchronous function that takes a parameter containing the file path of the ZIP archive. Within the function, you instantiate adm-zip with the ZIP archive file path provided by the filepath parameter.

      Next, you define a template literal. Inside the template literal placeholder (${}), you invoke the parse() method from the path module with the file path. The parse() method returns an object. To get the name of the ZIP file without the file extension, you access the name property on the object that the parse() method returns. The template literal then interpolates that value with the _extracted string, and the result is stored in the outputDir variable. This will be the name of the extracted directory.
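
      To make this concrete, here is a small sketch (the parseExample.js filename is hypothetical) of what parse() returns for the archive path used in this tutorial; the commented lines show what Node.js prints:

      zip_app/parseExample.js

      const path = require("path");
      
      // parse() splits a file path into root, dir, base, ext, and name
      console.log(path.parse("./test.zip"));
      // { root: '', dir: '.', base: 'test.zip', ext: '.zip', name: 'test' }
      
      // which is why the template literal evaluates to "test_extracted"
      console.log(`${path.parse("./test.zip").name}_extracted`);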

      Next, you invoke the adm-zip module’s extractAllTo method with the directory name stored in outputDir to extract the archive’s contents into that directory. After that, you invoke console.log() to log a success message.

      Finally, you call the extractArchive() function with the ZIP archive path.

      Save your file and exit the editor, then run the extractArchive.js file with the following command:

      • node extractArchive.js

      You receive the following output:

      Output

      Extracted to "test_extracted" successfully

      Confirm that the directory containing the ZIP contents has been created:

      • ls

      You will receive the following output:

      Output

      createArchive.js file4.txt package-lock.json readArchive.js test.zip updateArchive.js extractArchive.js node_modules package.json test test_extracted

      Now, navigate into the directory containing the extracted contents:

      • cd test_extracted

      List the contents in the directory:

      • ls

      You will receive the following output:

      Output

      file1.txt file2.txt file3.txt file4.txt underwater.png

      You can now see that the directory has all the files that were in the original directory.

      You’ve now extracted the ZIP archive contents into a directory.

      Conclusion

      In this tutorial, you created a ZIP archive, listed its contents, added a new file to the archive, and extracted all of its contents into a directory using the adm-zip module. This will serve as a good foundation for working with ZIP archives in Node.js.

      To learn more about the adm-zip module, view the adm-zip documentation. To continue building your Node.js knowledge, see the How To Code in Node.js series.




      How To Work with Files Using Streams in Node.js


      The author selected Girls Who Code to receive a donation as part of the Write for DOnations program.

      Introduction

      The concept of streams in computing usually describes the delivery of data in a steady, continuous flow. You can use streams for reading from or writing to a source continuously, thus eliminating the need to fit all the data in memory at once.

      Using streams provides two major advantages. One is that you can use your memory efficiently since you do not have to load all the data into memory before you can begin processing. Another advantage is that using streams is time-efficient. You can start processing data almost immediately instead of waiting for the entire payload. These advantages make streams a suitable tool for large data transfer in I/O operations. Files are a collection of bytes that contain some data. Since files are a common data source in Node.js, streams can provide an efficient way to work with files in Node.js.

      Node.js provides a streaming API in the stream module, a core Node.js module, for working with streams. All Node.js streams are an instance of the EventEmitter class (for more on this, see Using Event Emitters in Node.js). They emit different events you can listen for at various intervals during the data transmission process. The native stream module provides an interface consisting of different functions for listening to those events that you can use to read and write data, manage the transmission life cycle, and handle transmission errors.

      There are four different kinds of streams in Node.js. They are:

      • Readable streams: streams you can read data from.
      • Writable streams: streams you can write data to.
      • Duplex streams: streams you can read from and write to (usually simultaneously).
      • Transform streams: a duplex stream in which the output (or writable stream) is dependent on the modification of the input (or readable stream).
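
      As a brief illustration of these kinds (a minimal sketch, assuming Node.js 12 or later for Readable.from(); the stream-kinds.js filename is hypothetical), the snippet below builds a readable stream from in-memory data, pipes it through a transform stream, and writes the result to process.stdout, which is a writable stream:

      stream-kinds.js

      const { Readable, Transform } = require('stream');
      
      // A readable stream built from in-memory data
      const readable = Readable.from(['hello, ', 'streams']);
      
      // A transform stream is a duplex stream whose output rewrites its input
      const upperCase = new Transform({
          transform(chunk, encoding, callback) {
              callback(null, chunk.toString().toUpperCase());
          }
      });
      
      // process.stdout is a writable stream, so data flows readable -> transform -> writable
      readable.pipe(upperCase).pipe(process.stdout); // prints HELLO, STREAMS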

      The file system module (fs) is a native Node.js module for manipulating files and navigating the local file system in general. It provides several methods for doing this. Two of these methods implement the streaming API. They provide an interface for reading and writing files using streams. Using these two methods, you can create readable and writable file streams.

      In this article, you will read from and write to a file using the fs.createReadStream and fs.createWriteStream functions. You will also use the output of one stream as the input of another and implement a custom transform stream. By performing these actions, you will learn to use streams to work with files in Node.js. To demonstrate these concepts, you will write a command-line program with commands that replicate the cat functionality found in Linux-based systems, write input from a terminal to a file, copy files, and transform the content of a file.

      Prerequisites

      To complete this tutorial, you will need:

      Step 1 — Setting up a File Handling Command-Line Program

      In this step, you will write a command-line program with basic commands. This command-line program will demonstrate the concepts you’ll learn later in the tutorial, where you’ll use these commands with the functions you’ll create to work with files.

      To begin, create a folder to contain all your files for this program. In your terminal, create a folder named node-file-streams:

      • mkdir node-file-streams

      Using the cd command, change your working directory to the new folder:

      • cd node-file-streams

      Next, create and open a file called mycliprogram in your favorite text editor. This tutorial uses GNU nano, a terminal text editor. To use nano to create and open your file, type the following command:

      • nano mycliprogram

      In your text editor, add the following code to specify the shebang, store the array of command-line arguments from the Node.js process, and store the list of commands the application should have.

      node-file-streams/mycliprogram

      #!/usr/bin/env node
      
      const args = process.argv;
      const commands = ['read', 'write', 'copy', 'reverse'];
      

      The first line contains a shebang, which is a path to the program interpreter. Adding this line tells the program loader to parse this program using Node.js.

      When you run a Node.js script on the command line, several command-line arguments are passed to the Node.js process. You can access these arguments using the argv property of the Node.js process. The argv property is an array that contains the command-line arguments passed to a Node.js script. In the second line, you assign that property to a variable called args.
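
      To see what this array holds, consider a hypothetical script named argvdemo that only logs process.argv. The exact interpreter and script paths in the commented output will vary on your system:

      argvdemo

      #!/usr/bin/env node
      
      // Invoked as: ./argvdemo read test.txt
      // process.argv holds the interpreter path, the script path, and then your arguments
      console.log(process.argv);
      // [ '/usr/local/bin/node', '/path/to/argvdemo', 'read', 'test.txt' ]

      This layout is why the program reads the command from index 2 and the file path from index 3 later in this step.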

      Next, create a getHelpText function to display a manual of how to use the program. Add the code below to your mycliprogram file:

      node-file-streams/mycliprogram

      ...
      const getHelpText = function() {
          const helpText = `
          simplecli is a simple cli program to demonstrate how to handle files using streams.
          usage:
              mycliprogram <command> <path_to_file>
      
              <command> can be:
              read: Print a file's contents to the terminal
              write: Write a message from the terminal to a file
              copy: Create a copy of a file in the current directory
              reverse: Reverse the content of a file and save its output to another file.
      
              <path_to_file> is the path to the file you want to work with.
          `;
          console.log(helpText);
      }
      

      The getHelpText function prints out the multi-line string you created as the help text for the program. The help text shows the command-line arguments or parameters that the program expects.

      Next, you’ll add the control logic to check the length of args and provide the appropriate response:

      node-file-streams/mycliprogram

      ...
      let command = '';
      
      if(args.length < 3) {
          getHelpText();
          return;
      }
      else if(args.length > 4) {
          console.log('More arguments provided than expected');
          getHelpText();
          return;
      }
      else {
          command = args[2]
          if(!args[3]) {
              console.log('This tool requires at least one path to a file');
              getHelpText();
              return;
          }
      }
      

      In the code snippet above, you have created an empty string command to store the command received from the terminal. The first if block checks whether the length of the args array is less than 3. If it is, no command was passed when running the program. In this case, the program prints the help text to the terminal and terminates.

      The else if block checks to see if the length of the args array is greater than 4. If it is, then the program has received more arguments than it needs. The program will print a message to this effect along with the help text and terminate.

      Finally, in the else block, you store the third element or the element in the second index of the args array in the command variable. The code also checks whether there is a fourth element or an element with index = 3 in the args array. If the item does not exist, it prints a message to the terminal indicating that you need a file path to continue.

      Save the file. Then run the application:

      • ./mycliprogram

      You might get a permission denied error similar to the output below:

      Output

      -bash: ./mycliprogram: Permission denied

      To fix this error, you will need to provide the file with execution permissions, which you can do with the following command:

      • chmod +x mycliprogram

      Run the file again:

      • ./mycliprogram

      The output will look similar to this:

      Output

      simplecli is a simple cli program to demonstrate how to handle files using streams.
      usage:
          mycliprogram <command> <path_to_file>

          <command> can be:
          read: Print a file's contents to the terminal
          write: Write a message from the terminal to a file
          copy: Create a copy of a file in the current directory
          reverse: Reverse the content of a file and save its output to another file.

          <path_to_file> is the path to the file you want to work with.

      Finally, you are going to partially implement the commands in the commands array you created earlier. Open the mycliprogram file and add the code below:

      node-file-streams/mycliprogram

      ...
      switch(commands.indexOf(command)) {
          case 0:
              console.log('command is read');
              break;
          case 1:
              console.log('command is write');
              break;
          case 2:
              console.log('command is copy');
              break;
          case 3:
              console.log('command is reverse');
              break;
          default:
              console.log('You entered a wrong command. See help text below for supported functions');
              getHelpText();
              return;
      }
      

      Any time you enter a command found in the switch statement, the program runs the appropriate case block for the command. For this partial implementation, you print the name of the command to the terminal. If the string is not in the list of commands you created above, the program will print out a message to that effect with the help text. Then the program will terminate.

      Save the file, then re-run the program with the read command and any file name:

      • ./mycliprogram read test.txt

      The output will look similar to this:

      Output

      command is read

      You have now successfully created a command-line program. In the following section, you will replicate the cat functionality as the read command in the application using createReadStream().

      Step 2 — Reading a File with createReadStream()

      The read command in the command-line application will read a file from the file system and print it out to the terminal similar to the cat command in a Linux-based terminal. In this section, you will implement that functionality using createReadStream() from the fs module.

      The createReadStream function creates a readable stream that emits events that you can listen to since it inherits from the EventEmitter class. The data event is one of these events. Every time the readable stream reads a piece of data, it emits the data event, releasing a piece of data. When used with a callback function, it invokes the callback with that piece of data or chunk, and you can process that data within that callback function. In this case, you want to display that chunk in the terminal.

      To begin, add a text file to your working directory for easy access. In this section and some subsequent ones, you will be using a file called lorem-ipsum.txt. It is a text file containing ~1200 lines of lorem ipsum text generated using the Lorem Ipsum Generator, and it is hosted on GitHub. In your terminal, enter the following command to download the file to your working directory:

      • wget https://raw.githubusercontent.com/do-community/node-file-streams/999e66a11cd04bc59843a9c129da759c1c515faf/lorem-ipsum.txt

      To replicate the cat functionality in your command-line application, you’ll need to import the fs module because it contains the createReadStream function you need. To do this, open the mycliprogram file and add this line immediately after the shebang:

      node-file-streams/mycliprogram

      #!/usr/bin/env node
      
      const fs = require('fs');
      

      Next, you will create a function below the switch statement called read() with a single parameter: the file path for the file you want to read. This function will create a readable stream from that file and listen for the data event on that stream.

      node-file-streams/mycliprogram

      ...
      function read(filePath) {
          const readableStream = fs.createReadStream(filePath);
      
          readableStream.on('error', function (error) {
              console.log(`error: ${error.message}`);
          })
      
          readableStream.on('data', (chunk) => {
              console.log(chunk);
          })
      }
      

      The code also checks for errors by listening for the error event. When an error occurs, an error message will print to the terminal.

      Finally, you should replace console.log() with the read() function in the first case block case 0 as shown in the code block below:

      node-file-streams/mycliprogram

      ...
      switch (commands.indexOf(command)) {
          case 0:
              read(args[3]);
              break;
          ...
      }
      

      Save the file to persist the new changes and run the program:

      • ./mycliprogram read lorem-ipsum.txt

      The output will look similar to this:

      Output

      <Buffer 0a 0a 4c 6f 72 65 6d 20 69 70 73 75 6d 20 64 6f 6c 6f 72 20 73 69 74 20 61 6d 65 74 2c 20 63 6f 6e 73 65 63 74 65 74 75 72 20 61 64 69 70 69 73 63 69 ... >
      ...
      <Buffer 76 69 74 61 65 20 61 6e 74 65 20 66 61 63 69 6c 69 73 69 73 20 6d 61 78 69 6d 75 73 20 75 74 20 69 64 20 73 61 70 69 65 6e 2e 20 50 65 6c 6c 65 6e 74 ... >

      Based on the output above, you can see that the data was read in chunks or pieces, and these pieces of data are of the Buffer type. For the sake of brevity, the terminal output above shows only two chunks, and the ellipsis indicates that there are several buffers in between the chunks shown here. The larger the file, the greater the number of buffers or chunks.
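
      The size of each chunk is governed by the stream's highWaterMark option, which defaults to 64 KiB for file streams. As a small sketch that is not part of the tutorial's program (the chunk-size.js filename is hypothetical), you could log each chunk's size and request smaller chunks:

      chunk-size.js

      const fs = require('fs');
      
      // The default highWaterMark for fs read streams is 64 KiB; an options
      // object lets you request a different chunk size per stream
      const readableStream = fs.createReadStream('lorem-ipsum.txt', {
          highWaterMark: 16 * 1024 // 16 KiB chunks
      });
      
      readableStream.on('data', (chunk) => {
          console.log(`received ${chunk.length} bytes`);
      });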

      To return the data in a human-readable format, you will set the encoding type of the data by passing the string value of the encoding you want as a second argument to the createReadStream() function. Add the following highlighted code to set the encoding type to utf8.

      node-file-streams/mycliprogram

      
      ...
      const readableStream = fs.createReadStream(filePath, 'utf8')
      ...
      

      Re-running the program will display the contents of the file in the terminal. The program prints the lorem ipsum text from the lorem-ipsum.txt file line by line as it appears in the file.

      Output

      Lorem ipsum dolor sit amet, consectetur adipiscing elit. Aenean est tortor, eleifend et enim vitae, mattis condimentum elit. In dictum ex turpis, ac rutrum libero tempus sed...
      ...
      ...Quisque nisi diam, viverra vel aliquam nec, aliquet ut nisi. Nullam convallis dictum nisi quis hendrerit. Maecenas venenatis lorem id faucibus venenatis. Suspendisse sodales, tortor ut condimentum fringilla, turpis erat venenatis justo, lobortis egestas massa massa sed magna. Phasellus in enim vel ante viverra ultricies.

      The output above shows a small fraction of the content of the file printed to the terminal. When you compare the terminal output with the lorem-ipsum.txt file, you will see that the content is the same and takes the same formatting as the file, just like with the cat command.

      In this section, you implemented the cat functionality in your command-line program to read the content of a file and print it to the terminal using the createReadStream function. In the next step, you will create a file based on the input from the terminal using createWriteStream().

      Step 3 — Writing to a File with createWriteStream()

      In this section, you will write input from the terminal to a file using createWriteStream(). The createWriteStream function returns a writable file stream that you can write data to. Like the readable stream in the previous step, this writable stream emits a set of events like error, finish, and pipe. Additionally, it provides the write function for writing data to the stream in chunks or bits. The write function takes in the chunk, which could be a string, a Buffer, a Uint8Array, or any other JavaScript value. It also allows you to specify an encoding type if the chunk is a string.
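
      Before wiring this up to the terminal, here is a minimal, standalone sketch of the writable stream life cycle described above; the write-example.js and demo-output.txt names are made up for illustration:

      write-example.js

      const fs = require('fs');
      
      const writableStream = fs.createWriteStream('demo-output.txt');
      
      // write() queues chunks; for string chunks, an encoding can be passed
      writableStream.write('first line\n', 'utf8');
      writableStream.write('second line\n');
      
      // end() optionally takes a final chunk, flushes the stream, and emits 'finish'
      writableStream.end('last line\n');
      
      writableStream.on('finish', () => {
          console.log('All chunks have been written to demo-output.txt');
      });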

      To write input from a terminal to a file, you will create a function called write in your command-line program. In this function, you will create a prompt that receives input from the terminal (until the user terminates it) and writes the data to a file.

      First, you will need to import the readline module at the top of the mycliprogram file. The readline module is a native Node.js module that you can use to receive data from a readable stream like the standard input (stdin) or your terminal one line at a time. Open your mycliprogram file and add the highlighted line:

      node-file-streams/mycliprogram

      #!/usr/bin/env node
      
      const fs = require('fs');
      const readline = require('readline');
      

      Then, add the following code below the read() function.

      node-file-streams/mycliprogram

      ...
      function write(filePath) {
          const writableStream = fs.createWriteStream(filePath);
      
          writableStream.on('error',  (error) => {
              console.log(`An error occurred while writing to the file. Error: ${error.message}`);
          });
      }
      

      Here, you are creating a writable stream with the filePath parameter. This file path will be the command-line argument after the write word. You are also listening for the error event if anything goes wrong (for example, if you provide a filePath that does not exist).

      Next, you will write the prompt to receive a message from the terminal and write it to the specified filePath using the readline module you imported earlier. To create a readline interface, a prompt, and to listen for the line event, update the write function as shown in the block:

      node-file-streams/mycliprogram

      ...
      function write(filePath) {
          const writableStream = fs.createWriteStream(filePath);
      
          writableStream.on('error',  (error) => {
              console.log(`An error occurred while writing to the file. Error: ${error.message}`);
          });
      
          const rl = readline.createInterface({
              input: process.stdin,
              output: process.stdout,
              prompt: 'Enter a sentence: '
          });
      
          rl.prompt();
      
          rl.on('line', (line) => {
              switch (line.trim()) {
                  case 'exit':
                      rl.close();
                      break;
                  default:
                      const sentence = line + '\n';
                      writableStream.write(sentence);
                      rl.prompt();
                      break;
              }
          }).on('close', () => {
              writableStream.end();
              writableStream.on('finish', () => {
                  console.log(`All your sentences have been written to ${filePath}`);
              })
              setTimeout(() => {
                  process.exit(0);
              }, 100);
          });
      }
      

      You created a readline interface (rl) that allows the program to read the standard input (stdin) from your terminal on a line-by-line basis and write a specified prompt string to standard output (stdout). You also called the prompt() function to write the configured prompt message to a new line and to allow the user to provide additional input.

      Then you chained two event listeners together on the rl interface. The first one listens for the line event, emitted each time the input stream receives an end-of-line input. This input could be a line feed character (\n), a carriage return character (\r), or both characters together (\r\n), and it usually occurs when you press the ENTER or return key on your computer. Therefore, any time you press either of these keys while typing in the terminal, the line event is emitted. The callback function receives the single line of input as a string (line).

      You trimmed the line and checked to see if it is the word exit. If not, the program will add a new line character to line and write the sentence to the filePath using the .write() function. Then you called the prompt function to prompt the user to enter another line of text. If the line is exit, the program calls the close function on the rl interface. The close function closes the rl instance and releases the standard input (stdin) and output (stdout) streams.

      This function brings us to the second event you listened for on the rl instance: the close event. This event is emitted when you call rl.close(). After writing data to a stream, you have to call the end function on the stream to tell your program that it should no longer write data to the writable stream. Doing this will ensure that the data is completely flushed to your output file. Therefore, when you type the word exit, you close the rl instance and stop your writable stream by calling the end function.

      To provide feedback to the user that the program has successfully written all the text from the terminal to the specified filePath, you listened for the finish event on writableStream. In the callback function, you logged a message to the terminal to inform the user when writing is complete. Finally, you exited the process after 100ms to provide enough time for the finish event to provide feedback.

      Finally, to call this function in your mycliprogram, replace the console.log statement in the case 1 block in the switch statement with the new write function, as shown here:

      node-file-streams/mycliprogram

      ...
      switch (commands.indexOf(command)) {
          ...
      
          case 1:
              write(args[3]);
              break;
      
          ...
      }
      

      Save the file containing the new changes. Then run the command-line application in your terminal with the write command.

      • ./mycliprogram write output.txt

      At the Enter a sentence prompt, add any input you’d like. After a couple of entries, type exit.

      The output will look similar to this (with your input displaying instead of the highlighted lines):

      Output

      Enter a sentence: Twinkle, twinkle, little star
      Enter a sentence: How I wonder what you are
      Enter a sentence: Up above the hills so high
      Enter a sentence: Like a diamond in the sky
      Enter a sentence: exit
      All your sentences have been written to output.txt

      Check output.txt to see the file content using the read command you created earlier.

      • ./mycliprogram read output.txt

      The terminal output should contain all the text you have typed into the command except exit. Based on the input above, the output.txt file has the following content:

      Output

      Twinkle, twinkle, little star
      How I wonder what you are
      Up above the hills so high
      Like a diamond in the sky

      In this step, you wrote to a file using streams. Next, you will implement the function that copies files in your command-line program.

      Step 4 — Copying Files Using pipe()

      In this step, you will use the pipe function to create a copy of a file using streams. Although there are other ways to copy files using streams, using pipe is preferred because you don’t need to manage the data flow.

      For example, one way to copy files using streams would be to create a readable stream for the file, listen to the stream on the data event, and write each chunk from the stream event to a writable stream of the file copy. The snippet below shows an example:

      example.js

      const fs = require('fs');
      const readableStream = fs.createReadStream('lorem-ipsum.txt', 'utf8');
      const writableStream = fs.createWriteStream('lorem-ipsum-copy.txt');
      
      readableStream.on('data', (chunk) => {
          writableStream.write(chunk);
      });
      
      readableStream.on('end', () => {
          writableStream.end();
      });
      

      The disadvantage of this method is that you need to manage the events on both the readable and writable streams.

      The preferred method for copying files using streams is to use pipe. A plumbing pipe passes water from a source such as a water tank (output) to a faucet or tap (input). Similarly, you use pipe to direct data from an output stream to an input stream. (If you are familiar with the Linux-based bash shell, the pipe operator (|) directs data from one stream to another.)

      Piping in Node.js provides the ability to read data from a source and write it somewhere else without managing the data flow as you would using the first method. Unlike the previous approach, you do not need to manage the events on both the readable and writable streams. For this reason, it is a preferred approach for implementing a copy command in your command-line application that uses streams.
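
      One caveat worth knowing: pipe() by itself does not forward errors from one stream to the next, so each stream still needs its own error handler. Node.js v10 and later also provide stream.pipeline(), which wires streams together and reports an error from any of them through a single callback. A hedged sketch of the same copy operation using pipeline() (the pipeline-example.js filename is hypothetical):

      pipeline-example.js

      const fs = require('fs');
      const { pipeline } = require('stream');
      
      // pipeline() pipes the streams together and reports an error
      // from either stream through a single callback
      pipeline(
          fs.createReadStream('lorem-ipsum.txt'),
          fs.createWriteStream('lorem-ipsum-copy.txt'),
          (error) => {
              if (error) {
                  console.log(`Copy failed. ${error.message}`);
              } else {
                  console.log('Copy finished.');
              }
          }
      );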

      In the mycliprogram file, you will add a new function invoked when a user runs the program with the copy command-line argument. The copy method will use pipe() to copy from an input file to the destination copy of the file. Create the copy function after the write function as shown below:

      node-file-streams/mycliprogram

      ...
      function copy(filePath) {
          const inputStream = fs.createReadStream(filePath)
          const fileCopyPath = filePath.split('.')[0] + '-copy.' + filePath.split('.')[1]
          const outputStream = fs.createWriteStream(fileCopyPath)
      
          inputStream.pipe(outputStream)
      
          outputStream.on('finish', () => {
              console.log(`You have successfully created a ${filePath} copy. The new file name is ${fileCopyPath}.`);
          })
      }
      

      In the copy function, you created an input or readable stream using fs.createReadStream(). You also generated a new name for the destination copy of the file and created an output or writable stream using fs.createWriteStream(). Then you piped the data from the inputStream to the outputStream using .pipe(). Finally, you listened for the finish event and printed out a message on a successful file copy.

      Recall that to close a writable stream, you have to call the end() function on the stream. When piping streams, the end() function is called on the writable stream (outputStream) when the readable stream (inputStream) emits the end event. The end() function of the writable stream emits the finish event, and you listen for this event to indicate that you have finished copying a file.

      To see this function in action, open the mycliprogram file and update the case 2 block of the switch statement as shown below:

      node-file-streams/mycliprogram

      ...
      switch (commands.indexOf(command)) {
          ...
      
          case 2:
              copy(args[3]);
              break;
      
          ...
      }
      

      Calling the copy function in the case 2 block of the switch statement ensures that when you run the mycliprogram program with the copy command and the required file path, the copy function is executed.

      Run mycliprogram:

      • ./mycliprogram copy lorem-ipsum.txt

      The output will look similar to this:

      Output

      You have successfully created a lorem-ipsum.txt copy. The new file name is lorem-ipsum-copy.txt.

      Within the node-file-streams folder, you will see a newly added file with the name lorem-ipsum-copy.txt.

      You have successfully added a copy function to your command-line program using pipe. In the next step, you will use streams to modify the content of a file.

      Step 5 — Reversing the Content of a File using Transform()

      In the previous three steps of this tutorial, you have worked with streams using the fs module. In this section, you will modify file streams using the Transform() class from the native stream module, which provides a transform stream. You can use a transform stream to read data, manipulate the data, and provide new data as output. Thus, the output is a ‘transformation’ of the input data. Node.js modules that use transform streams include the crypto module for cryptography and the zlib module with gzip for compressing and uncompressing files.
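
      For example, here is a minimal sketch using the built-in zlib transform stream to gzip the lorem-ipsum.txt file from Step 2 (the gzip-example.js filename and the .gz output name are arbitrary):

      gzip-example.js

      const fs = require('fs');
      const zlib = require('zlib');
      
      // zlib.createGzip() returns a built-in transform stream:
      // uncompressed bytes go in, gzip-compressed bytes come out
      fs.createReadStream('lorem-ipsum.txt')
          .pipe(zlib.createGzip())
          .pipe(fs.createWriteStream('lorem-ipsum.txt.gz'))
          .on('finish', () => {
              console.log('Compressed lorem-ipsum.txt to lorem-ipsum.txt.gz');
          });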

      You are going to implement a custom transform stream using the Transform() abstract class. The transform stream you create will reverse the contents of a file line by line, which will demonstrate how to use transform streams to modify the content of a file as you want.

      In the mycliprogram file, you will add a reverse function that the program will call when a user passes the reverse command-line argument.

      First, you need to import the Transform() class at the top of the file below the other imports. Add the highlighted line as shown below:

      node-file-streams/mycliprogram

      #!/usr/bin/env node
      ...
      const stream = require('stream');
      const Transform = stream.Transform || require('readable-stream').Transform;
      

      In Node.js versions earlier than v0.10, the Transform abstract class is missing. Therefore, the code block above includes the readable-stream polyfill so that this program can work with earlier versions of Node.js. If the Node.js version is 0.10 or later, the program uses the built-in abstract class; if not, it uses the polyfill.

      Note: If you are using a Node.js version earlier than 0.10, you will have to run npm init -y to create a package.json file and install the polyfill with npm install readable-stream in your working directory for the polyfill to be applied.

      Next, you will create the reverse function right under your copy function. In that function, you will create a readable stream using the filePath parameter, generate a name for the reversed file, and create a writable stream using that name. Then you will create reverseStream, an instance of the Transform() class. When you call the Transform() class, you pass in an options object containing one important function: the transform function.

      Beneath the copy function, add the code block below to add the reverse function.

      node-file-streams/mycliprogram

      ...
      function reverse(filePath) {
          const readStream = fs.createReadStream(filePath);
          const reversedDataFilePath = filePath.split('.')[0] + '-reversed.'+ filePath.split('.')[1];
          const writeStream = fs.createWriteStream(reversedDataFilePath);
      
          const reverseStream = new Transform({
              transform (data, encoding, callback) {
                  const reversedData = data.toString().split("").reverse().join("");
                  this.push(reversedData);
                  callback();
              }
          });
      
          readStream.pipe(reverseStream).pipe(writeStream).on('finish', () => {
              console.log(`Finished reversing the contents of ${filePath} and saving the output to ${reversedDataFilePath}.`);
          });
      }
      

      The transform function receives three parameters: the data, the encoding type, and a callback function. Within this function, you converted the data to a string, split the string into an array of characters, reversed that array, and joined it back together. This process rewrites the data backward instead of forward. Note that the transform function runs once per chunk, so this approach reverses the order of the entire file only when the file fits into a single chunk (64 KiB by default for file streams); a larger file would have each chunk reversed independently.

      Next, you connected the readStream to the reverseStream and finally to the writeStream using two pipe() functions. Finally, you listened for the finish event to alert the user when the file contents have been completely reversed.

      You will notice that the code above uses a different syntax for listening for the finish event. Instead of listening for the finish event on the writeStream on a new line, you chained the on function to the second pipe function. You can chain event listeners on a stream this way; in this case, doing so has the same effect as calling the on('finish') function on the writeStream.

      To wrap things up, replace the console.log statement in the case 3 block of the switch statement with reverse().

      node-file-streams/mycliprogram

      ...
      switch (commands.indexOf(command)) {
          ...
      
          case 3:
              reverse(args[3]);
              break;
      
          ...
      }
      

      To test this function, you will use another file containing the names of countries in alphabetical order (countries.csv). You can download it to your working directory by running the command below.

      • wget https://raw.githubusercontent.com/do-community/node-file-streams/999e66a11cd04bc59843a9c129da759c1c515faf/countries.csv

      You can then run mycliprogram.

      • ./mycliprogram reverse countries.csv

      The output will look similar to this:

      Output

      Finished reversing the contents of countries.csv and saving the output to countries-reversed.csv.

      Compare the contents of countries-reversed.csv with countries.csv to see the transformation. Each name is now written backward, and the order of the names has also been reversed (“Afghanistan” is written as “natsinahgfA” and appears last, and “Zimbabwe” is written as “ewbabmiZ” and appears first).

      You have successfully created a custom transform stream. You have also created a command-line program with functions that use streams for file handling.

      Conclusion

      Streams are used in native Node.js modules and in various yarn and npm packages that perform input/output operations because they provide an efficient way to handle data. In this article, you used various stream-based functions to work with files in Node.js. You built a command-line program with read, write, copy, and reverse commands. Then you implemented each of these commands in functions named accordingly. To implement the functions, you used the createReadStream and createWriteStream functions from the fs module, the pipe method available on readable streams, the createInterface function from the readline module, and the abstract Transform() class. Finally, you pieced these functions together in a small command-line program.

      As a next step, you could extend the command-line program you created to include other file system functionality you might want to use locally. A good example could be writing a personal tool that converts data from a .tsv stream source to .csv, or attempting to replicate the wget command you used in this article to download files from GitHub.

      The command-line program you have written handles command-line arguments itself and uses a simple prompt to get user input. You can learn more about building more robust and maintainable command-line applications by following How To Handle Command-line Arguments in Node.js Scripts and How To Create Interactive Command-line Prompts with Inquirer.js.

      Additionally, Node.js provides extensive documentation on the various Node.js stream module classes, methods, and events you might need for your use case.


