Personal Project

Thursday, May 25, 2017

Solutions for Write Contention Problems in Google Datastore

Google Cloud Platform provides many technologies that save developers' time when scaling and maintaining services for millions of users. If you are familiar with MySQL's architecture, you know the problems that appear when serving a high volume of users, because MySQL relies on locking when writing data to the database. Google Datastore avoids this issue and can write a lot of data in parallel; in particular, it is designed for high availability and scalability.

However, you must test and deeply understand how Datastore handles simultaneous writes. Otherwise, your service may only handle about 5 concurrent requests per second and look like a buggy solution. What is worse, Google's documentation does not give an easy-to-understand example of how to implement a sharding counter, so you may wonder how to apply this technique to your entity model and avoid this common mistake.


Limitations of Google Datastore
  • An entity group can only be written at a rate of about 1 request per second.
  • If you use @ndb.transactional or @ndb.transactional(xg=True) to write data, your API can only serve about 5 concurrent requests per second before Datastore starts returning write contention errors (a rough illustration follows this list).
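
To make the limit concrete, here is a minimal sketch of the kind of code that runs into it: a handler where every request updates the same entity inside a transaction. The Counter model and increment function are illustrative only, not part of the original example.

from google.appengine.ext import ndb

# Illustrative only: every write below targets the same entity group.
class Counter(ndb.Model):
    count = ndb.IntegerProperty(default=0)

@ndb.transactional
def increment():
    key = ndb.Key(Counter, 'global')        # one hot entity for all requests
    counter = key.get() or Counter(key=key)
    counter.count += 1
    counter.put()                           # roughly 1 write per second per entity group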

Why is writing data in Datastore so slow?

Because Datastore replicates your data across multiple locations to keep it highly available.



Solutions for Write Contention Problems

  • Sharding counters
  • Use Memcache to batch write requests, do the operations in memory, and respond to your clients quickly (a rough sketch follows this list)
  • Defer the Datastore writes to a task queue
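
As a rough sketch of the Memcache option: absorb the writes in memory and flush them to Datastore periodically from cron or a task queue. The LikeCounter model and the memcache key name below are made up for illustration, and anything that only lives in memcache can be lost if it is evicted.

from google.appengine.api import memcache
from google.appengine.ext import ndb

PENDING_KEY = 'pending_like_count'      # illustrative memcache key

class LikeCounter(ndb.Model):           # illustrative counter entity
    count = ndb.IntegerProperty(default=0)

def record_like():
    # Hot path: no Datastore write at all, just an atomic memcache increment.
    memcache.incr(PENDING_KEY, initial_value=0)

def flush_pending():
    # Run from cron or a task queue, e.g. once a minute.
    pending = memcache.get(PENDING_KEY)
    if not pending:
        return
    memcache.decr(PENDING_KEY, delta=pending)
    counter = LikeCounter.get_or_insert('global')
    counter.count += pending
    counter.put()                       # one Datastore write for many client requests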


In fact, the sharding counter is just an example provided by Google. The key point is that we can shard our entity group with a unique id, as shown below.

The following code spreads FriendShip writes across a thousand shards so they can run in parallel. If you need more throughput, just increase NUM_SHARDS.


import random

NUM_SHARDS = 1000

# Pick a random shard id so concurrent writes land on different entities
shard_string_index = str(random.randint(0, NUM_SHARDS - 1))
FriendShip(id=shard_string_index,
           user_key='user Id',
           friend_key='friend Id').put()


If you have many data models to update in one request, use a task queue to perform the updates and return only a small amount of information to your clients.

If you need to write transactional data with @ndb.transactional or @ndb.transactional(xg=True), defer the work to a task queue and return only a small amount of information to the clients.
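
A minimal sketch of that pattern, assuming the FriendShip model from the sharding example and App Engine's deferred library (the handler name and response body are illustrative, and the deferred builtin has to be enabled in app.yaml):

import webapp2
from google.appengine.ext import deferred, ndb

class FriendShip(ndb.Model):            # minimal stand-in for the model above
    user_key = ndb.StringProperty()
    friend_key = ndb.StringProperty()

@ndb.transactional(xg=True)
def write_friendship(user_id, friend_id):
    # The transactional write happens later, inside the task queue.
    FriendShip(id='%s:%s' % (user_id, friend_id),
               user_key=user_id,
               friend_key=friend_id).put()

class AddFriendHandler(webapp2.RequestHandler):
    def post(self):
        # Enqueue the write and answer the client right away with minimal data.
        deferred.defer(write_friendship,
                       self.request.get('user_id'),
                       self.request.get('friend_id'))
        self.response.write('{"status": "queued"}')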

Depending on your data model's design, you can use sharding counters, Memcache, task queues, or a hybrid approach to get the best performance out of Google Datastore and Google App Engine.



Monday, May 1, 2017

How to make a TURN server highly available?

If you want to keep your WebRTC video streaming services online without downtime, you must pay attention to the availability of your TURN server, because it plays an important role in helping two parties behind different NATs connect to each other for video or audio streaming.

The following instructions show how to automatically monitor your TURN server and restart it when it goes down.


1. Install the pexpect library for Python 

sudo pip install pexpect --upgrade



2. Edit MonitorStun.py 
- Open a TCP connection to your TURN server to check that it is alive 
- If it is down, SSH to the server and restart it  

#!/usr/bin/env python
import socket
import subprocess
import sys
from datetime import datetime
from pexpect import pxssh


# SSH to the TURN server and restart it
def connect_turn_server():
  s = pxssh.pxssh()

  # pxssh.login takes (server, username, password); pass port=... if SSH is not on 22
  if not s.login('TURN Server IP', 'ACCOUNT', 'PASSWORD'):
    print "SSH session failed on login."
    print str(s)
  else:
    print "SSH session login TURN successful"
    # Assumes the account can run this command with sudo without a password prompt
    s.sendline('sudo turnserver -c /usr/local/etc/turnserver.conf --daemon')
    s.prompt()         # match the prompt
    print s.before     # print everything before the prompt
    s.logout()


# Connect to the TURN server to check whether it is alive on ports 3478 and 3479
# Clear the screen
subprocess.call('clear', shell=True)

# Target TURN server
remoteServer    = 'SERVER IP'
remoteServerIP  = socket.gethostbyname(remoteServer)

# Print a nice banner with information on which host we are about to scan
print "-" * 60
print "Please wait, scanning remote host", remoteServerIP
print "-" * 60

# Check what time the scan started
t1 = datetime.now()

# Loop over the TURN ports and try to connect to each one

# with some error handling to catch interrupts and socket errors

try:
    for port in range(3478, 3480):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        result = sock.connect_ex((remoteServerIP, port))
        if result == 0:
            print "Port {}:      Open".format(port)
        else:
            print "TURN Server is down"
            connect_turn_server()
            print "restart TURN Server OK"
        sock.close()


except KeyboardInterrupt:
    print "You pressed Ctrl+C"
    sys.exit()

except socket.error:
    print "Couldn't connect to server"
    sys.exit()

# Print how long the check took
t2 = datetime.now()
print "Check completed in:", t2 - t1

                                        
3. Add MonitorStun.py to a cron job to check the TURN server every minute.


*/1 * * * * /your_path/monitorStun.py
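
The script is run directly through its shebang line, so remember to make it executable first:

chmod +x /your_path/monitorStun.py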

Of course, you can apply this technique to monitor any service, such as a SIP proxy on port 5060, Apache on port 80, or Tomcat on port 8080.
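
For example, to watch a SIP proxy, Apache, and Tomcat from the same script, only the port loop (and the restart command sent over SSH) needs to change; the port list below is just an illustration:

# Check SIP proxy (5060), Apache (80) and Tomcat (8080) instead of TURN
for port in [5060, 80, 8080]:
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if sock.connect_ex((remoteServerIP, port)) != 0:
        print "Port {}: service is down, restarting".format(port)
        connect_turn_server()   # replace with the restart logic for that service
    sock.close()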