Improve Python Base64 to Encode String Safely: Replace +, / and = Characters - Python Tutorial

By | August 21, 2019

In Python, we can use base64 to encode a string before transferring it. For the basics of encoding and decoding a string with base64, see this tutorial:

A Simple Guide to Python Base64 Encode String for Beginners – Python Tutorial

However, the basic base64 encoding function is not URL-safe. Consider the function:

base64.b64encode(s, altchars=None)

This function encodes data with base64, but the result may contain the characters +, / and =, which are not safe to use in a URL.
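To illustrate the altchars parameter in the signature above, here is a minimal sketch. The two sample bytes are an assumption, chosen only because they happen to produce +, / and = in the standard output. Note that altchars swaps + and / but leaves the = padding in place.

```python
import base64

# Sample bytes picked for illustration: they encode to +, / and a padding =
raw = b"\xfb\xff"

standard = base64.b64encode(raw)                # standard alphabet
custom = base64.b64encode(raw, altchars=b"-_")  # + becomes -, / becomes _

print(standard)  # b'+/8='
print(custom)    # b'-_8='
```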

Here is an example.

import base64
str = 'https://www.example.com/c%c/c++/?id=1&p=3'
base64_nosafe = base64.b64encode(str.encode(encoding='utf-8', errors='strict'))
print(base64_nosafe)

The encode result is:

b'aHR0cHM6Ly93d3cuZXhhbXBsZS5jb20vYyVjL2MrKy8/aWQ9MSZwPTM='

Here, the result contains both / and =.

To avoid the + and / characters, the base64 library provides a URL-safe function:

base64.urlsafe_b64encode(s)

Here is an example to show how to use it.

base64_safe = base64.urlsafe_b64encode(str.encode(encoding='utf-8', errors='strict'))
print(base64_safe)

The URL-safe encoded result is:

b'aHR0cHM6Ly93d3cuZXhhbXBsZS5jb20vYyVjL2MrKy8_aWQ9MSZwPTM='

However, the result still contains '='.
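The = sign is padding: base64 produces output in groups of four characters, so any input whose byte length is not a multiple of three is padded with one or two = signs. A minimal sketch, with sample inputs chosen only for illustration:

```python
import base64

# Inputs of 1, 2 and 3 bytes need 2, 1 and 0 padding characters respectively
samples = [b"a", b"ab", b"abc"]
encoded = [base64.urlsafe_b64encode(raw) for raw in samples]

for raw, enc in zip(samples, encoded):
    print(len(raw), enc)
# 1 b'YQ=='
# 2 b'YWI='
# 3 b'YWJj'
```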

How to improve base64 encoding and decoding to avoid +, / and =?

Here we write two functions that encode and decode a string with base64 without producing any of these characters.

Improved base64 encode

def urlsafe_b64encode(data):
    data = base64.b64encode(data.encode("utf-8"))
    data = data.decode("utf-8")
    data = data.replace("+",'-')
    data = data.replace("/",'_')
    data = data.replace("=",'')
    return data

Improved base64 decode

def urlsafe_b64decode(data):
    # Map the URL-safe characters back to the standard base64 alphabet
    data = data.replace("-",'+')
    data = data.replace("_",'/')
    # Restore the stripped padding so the length is a multiple of 4
    mod4 = len(data) % 4
    if mod4:
        data += '=' * (4 - mod4)
    data = base64.b64decode(data.encode("utf-8"))
    return data.decode("utf-8")

How to use?

safe_encode  = urlsafe_b64encode(str)
print(safe_encode)
safe_decode  = urlsafe_b64decode(safe_encode)
print(safe_decode)

The result is:

aHR0cHM6Ly93d3cuZXhhbXBsZS5jb20vYyVjL2MrKy8_aWQ9MSZwPTM
https://www.example.com/c%c/c++/?id=1&p=3

The encoded result contains no +, / or = characters.
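The same effect can probably be achieved by building on the library's own base64.urlsafe_b64encode and base64.urlsafe_b64decode and handling only the padding by hand. The sketch below is one possible variant, not the method used above; the helper names encode_nopad and decode_nopad are hypothetical.

```python
import base64

def encode_nopad(text):
    # Hypothetical helper: URL-safe base64 with the trailing = padding stripped
    encoded = base64.urlsafe_b64encode(text.encode("utf-8"))
    return encoded.rstrip(b"=").decode("ascii")

def decode_nopad(token):
    # Hypothetical helper: re-add 0 to 3 = characters so the length is a multiple of 4
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding).decode("utf-8")

url = "https://www.example.com/c%c/c++/?id=1&p=3"
token = encode_nopad(url)
print(token)
print(decode_nopad(token))
```

The expression -len(token) % 4 gives the number of characters missing from the last group of four, so no padding is added when the length is already a multiple of four.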
