Tutorial#

Caterva functions let users perform different operations on Caterva arrays, such as setting, copying, or slicing them. This section shows how to create and manipulate a Caterva array in a simple way.

import caterva as cat

cat.__version__
'0.7.4.dev0'

Creating an array#

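First, we create an array in which zero is the default value for any portion that has not been written yet. The itemsize argument gives the size of each element in bytes, and chunks and blocks set how the array is partitioned.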

c = cat.zeros((10000, 10000), itemsize=4, chunks=(1000, 1000), blocks=(100, 100))

c
<caterva.ndarray.NDArray at 0x7fd664070e60>
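To get a feel for what `chunks` and `blocks` mean for this array, the sketch below works out the partition grid in plain Python, without calling Caterva. It assumes a two-level layout in which the array is split into chunks and each chunk is split into blocks; `grid_counts` is a hypothetical helper, not part of the Caterva API.

```python
import math

def grid_counts(shape, parts):
    """Number of partitions along each axis (ceiling division)."""
    return tuple(math.ceil(s / p) for s, p in zip(shape, parts))

# Same parameters as the cat.zeros() call above.
shape, chunks, blocks, itemsize = (10000, 10000), (1000, 1000), (100, 100), 4

chunks_per_axis = grid_counts(shape, chunks)         # chunks along each axis
blocks_per_chunk_axis = grid_counts(chunks, blocks)  # blocks along each axis of a chunk

n_chunks = math.prod(chunks_per_axis)
n_blocks_per_chunk = math.prod(blocks_per_chunk_axis)

# Uncompressed sizes in bytes.
chunk_nbytes = math.prod(chunks) * itemsize
block_nbytes = math.prod(blocks) * itemsize
array_nbytes = math.prod(shape) * itemsize

print(n_chunks, n_blocks_per_chunk, chunk_nbytes, block_nbytes, array_nbytes)
```

Under that assumption, the array is made of 100 chunks of 4 MB (uncompressed) each, and every chunk holds 100 blocks of 40 KB.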

Reading and writing data#

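We can read and write Caterva arrays with NumPy-style indexing. The array only knows its item size (4 bytes here), so reads come back as raw bytes and must be viewed with the intended NumPy dtype.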

import struct
import numpy as np

dtype = np.int32

c[0, :] = np.arange(10000, dtype=dtype)
c[:, 0] = np.arange(10000, dtype=dtype)
c[0, 0]
array(b'', dtype='|S4')
np.array(c[0, 0]).view(dtype)
array(0, dtype=int32)
np.array(c[0, -1]).view(dtype)
array(9999, dtype=int32)
np.array(c[0, :]).view(dtype)
array([   0,    1,    2, ..., 9997, 9998, 9999], dtype=int32)
np.array(c[:, 0]).view(dtype)
array([   0,    1,    2, ..., 9997, 9998, 9999], dtype=int32)
np.array(c[:]).view(dtype)
array([[   0,    1,    2, ..., 9997, 9998, 9999],
       [   1,    0,    0, ...,    0,    0,    0],
       [   2,    0,    0, ...,    0,    0,    0],
       ...,
       [9997,    0,    0, ...,    0,    0,    0],
       [9998,    0,    0, ...,    0,    0,    0],
       [9999,    0,    0, ...,    0,    0,    0]], dtype=int32)
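The `b''` shown for `c[0, 0]` is not an empty item: the array holds opaque 4-byte items, and NumPy's `S4` display drops trailing NUL bytes, so four zero bytes print as an empty byte string. The sketch below reproduces the round trip with the standard `struct` module instead of NumPy. It assumes a little-endian machine (hence the `<i` format), which is an assumption about the host and not something the tutorial states.

```python
import struct

# Encode two of the values written above as 4-byte little-endian integers.
zero_item = struct.pack("<i", 0)
last_item = struct.pack("<i", 9999)

# Every byte of the zero item is NUL, so stripping trailing NULs leaves nothing.
# This mirrors what the S4 display does to the item.
displayed_zero = zero_item.rstrip(b"\x00")

# Reinterpreting the raw bytes as int32 recovers the numbers,
# which is the role played by .view(dtype) in the tutorial.
decoded_zero = struct.unpack("<i", zero_item)[0]
decoded_last = struct.unpack("<i", last_item)[0]

print(len(zero_item), displayed_zero, decoded_zero, decoded_last)
```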

Persistent data#

When we create a Caterva array, we can specify where it will be stored. We can then open this array whenever we want, and it will still contain all of its data because it is stored persistently.

c1 = cat.full((1000, 1000), fill_value=b"pepe", chunks=(100, 100), blocks=(50, 50),
             urlpath="cat_tutorial.caterva")
c2 = cat.open("cat_tutorial.caterva")

c2.info
Type          : NDArray
Itemsize      : 4
Shape         : (1000, 1000)
Chunks        : (100, 100)
Blocks        : (50, 50)
Comp. codec   : LZ4
Comp. level   : 5
Comp. filters : [SHUFFLE]
Comp. ratio   : 588.24
np.array(c2[0, 20:30]).view("S4")
array([b'pepe', b'pepe', b'pepe', b'pepe', b'pepe', b'pepe', b'pepe',
       b'pepe', b'pepe', b'pepe'], dtype='|S4')
import os
if os.path.exists("cat_tutorial.caterva"):
    cat.remove("cat_tutorial.caterva")
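As a rough sanity check on the compression ratio reported by `c2.info`, the snippet below converts it back into an approximate compressed size. It assumes the ratio is defined as uncompressed bytes divided by compressed bytes, which the output itself does not state; the variable names are illustrative.

```python
import math

# Values taken from the c2.info output above.
shape, itemsize, comp_ratio = (1000, 1000), 4, 588.24

uncompressed_nbytes = math.prod(shape) * itemsize
# Assumption: ratio = uncompressed size / compressed size.
approx_compressed_nbytes = round(uncompressed_nbytes / comp_ratio)

print(uncompressed_nbytes, approx_compressed_nbytes)
```

Under that assumption, about 4 MB of constant-valued data shrinks to a few kilobytes, which is what one would expect for an array filled with a single repeated item.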

Compression params#

When we copy a Caterva array, we can easily change its compression parameters, as the following example shows.

b = np.arange(1000000, dtype=np.int64).tobytes()  # explicit 8-byte items, matching itemsize=8 below

c1 = cat.from_buffer(b, shape=(1000, 1000), itemsize=8, chunks=(500, 10), blocks=(50, 10))

c1.info
Type          : NDArray
Itemsize      : 8
Shape         : (1000, 1000)
Chunks        : (500, 10)
Blocks        : (50, 10)
Comp. codec   : LZ4
Comp. level   : 5
Comp. filters : [SHUFFLE]
Comp. ratio   : 6.64
c2 = c1.copy(chunks=(500, 10), blocks=(50, 10),
             codec=cat.Codec.ZSTD, clevel=9, filters=[cat.Filter.BITSHUFFLE])

c2.info
Type          : NDArray
Itemsize      : 8
Shape         : (1000, 1000)
Chunks        : (500, 10)
Blocks        : (50, 10)
Comp. codec   : ZSTD
Comp. level   : 9
Comp. filters : [BITSHUFFLE]
Comp. ratio   : 20.81
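Filters matter because they rearrange the bytes before the codec sees them. The sketch below illustrates the idea in plain Python: a hand-written byte shuffle (not Blosc's implementation) applied to sequential 8-byte integers, with `zlib` standing in for LZ4 and ZSTD, which are not in the standard library. `byte_shuffle` and `byte_unshuffle` are hypothetical helpers.

```python
import zlib

def byte_shuffle(buf: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every item, then byte 1 of every item, and so on."""
    return b"".join(buf[i::itemsize] for i in range(itemsize))

def byte_unshuffle(buf: bytes, itemsize: int) -> bytes:
    """Inverse of byte_shuffle; len(buf) must be a multiple of itemsize."""
    n_items = len(buf) // itemsize
    out = bytearray(len(buf))
    for i in range(itemsize):
        out[i::itemsize] = buf[i * n_items:(i + 1) * n_items]
    return bytes(out)

itemsize = 8
n_items = 100_000
# Sequential integers as little-endian 8-byte items, similar in spirit to np.arange.
raw = b"".join(i.to_bytes(itemsize, "little") for i in range(n_items))

shuffled = byte_shuffle(raw, itemsize)

plain_size = len(zlib.compress(raw, 5))
shuffled_size = len(zlib.compress(shuffled, 5))

print(f"ratio without shuffle: {len(raw) / plain_size:.2f}")
print(f"ratio with shuffle:    {len(raw) / shuffled_size:.2f}")
```

The exact numbers depend on the codec and level, so they will not match the ratios printed by Caterva; the point is only that the shuffled buffer compresses better, and that the transformation is reversible.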

Metalayers#

Metalayers are small pieces of metadata that describe the properties of the data stored in a container. Users can easily read and edit the metalayers of a Caterva array.

from msgpack import packb, unpackb
meta = {
    "dtype": packb("i8"),
    "coords": packb([5.14, 23.])
}
c = cat.zeros((1000, 1000), 5, chunks=(100, 100), blocks=(50, 50), meta=meta)
len(c.meta)
3
c.meta.keys()
['caterva', 'dtype', 'coords']
for key in c.meta:
    print(f"{key} -> {unpackb(c.meta[key])}")
caterva -> [0, 2, [1000, 1000], [100, 100], [50, 50]]
dtype -> i8
coords -> [5.14, 23.0]
c.meta["coords"] = packb([0., 23.])
for key in c.meta:
    print(f"{key} -> {unpackb(c.meta[key])}")
caterva -> [0, 2, [1000, 1000], [100, 100], [50, 50]]
dtype -> i8
coords -> [0.0, 23.0]
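Besides the two user-supplied entries, the output lists a `caterva` metalayer that was not passed in by the user. Its values line up with the array's parameters, so a plausible reading is `[version, ndim, shape, chunks, blocks]`. That interpretation, and the field names below, are assumptions drawn from the printed values rather than from the Caterva documentation.

```python
from typing import NamedTuple, Tuple

class CatervaLayer(NamedTuple):
    # Field names are an assumed reading of the printed list.
    version: int
    ndim: int
    shape: Tuple[int, ...]
    chunks: Tuple[int, ...]
    blocks: Tuple[int, ...]

def parse_caterva_layer(values) -> CatervaLayer:
    """Give names to the five entries of the unpacked 'caterva' metalayer."""
    version, ndim, shape, chunks, blocks = values
    layer = CatervaLayer(version, ndim, tuple(shape), tuple(chunks), tuple(blocks))
    # Check that the assumed reading is at least self-consistent.
    if not (len(layer.shape) == len(layer.chunks) == len(layer.blocks) == layer.ndim):
        raise ValueError("entries do not agree on the number of dimensions")
    return layer

# The list printed above for the 'caterva' key.
layer = parse_caterva_layer([0, 2, [1000, 1000], [100, 100], [50, 50]])
print(layer)
```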

Small tutorial#

This example shows how easy it is to create a Caterva array from an image, and how users can manipulate it with Caterva and PIL Image functions.

from PIL import Image
im = Image.open("../_static/blosc-logo_128.png")

im
[Image: the Blosc logo loaded from blosc-logo_128.png]
meta = {"dtype": b"|u1"}

c = cat.asarray(np.array(im), chunks=(50, 50, 4), blocks=(10, 10, 4), meta=meta)

c.info
Type          : NDArray
Itemsize      : 1
Shape         : (70, 128, 4)
Chunks        : (50, 50, 4)
Blocks        : (10, 10, 4)
Comp. codec   : LZ4
Comp. level   : 5
Comp. filters : [SHUFFLE]
Comp. ratio   : 4.31
im2 = c[15:55, 10:35]  # Letter B

Image.fromarray(np.array(im2).view(c.meta["dtype"]))
[Image: the cropped region of the logo showing the letter B]
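The crop `c[15:55, 10:35]` selects a window of 40 by 25 pixels with all 4 channels. The sketch below, in plain Python, works out the resulting shape and which cells of the `(50, 50, 4)` chunk grid the window overlaps. `touched_chunks` is a hypothetical helper, and the code makes no claim about how Caterva reads the data internally.

```python
import math

shape = (70, 128, 4)
chunks = (50, 50, 4)
# The slice from the tutorial as (start, stop) pairs; the third axis is taken whole.
region = ((15, 55), (10, 35), (0, 4))

def touched_chunks(region, chunks):
    """Range of chunk indices overlapped by the region along each axis."""
    return [range(start // c, (stop - 1) // c + 1)
            for (start, stop), c in zip(region, chunks)]

result_shape = tuple(stop - start for start, stop in region)
per_axis = touched_chunks(region, chunks)

n_touched = math.prod(len(r) for r in per_axis)
n_total = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))

print(result_shape, [list(r) for r in per_axis], n_touched, n_total)
```

Here the window spans two chunk rows (rows 0 to 49 and 50 to 69) but a single chunk column, so it overlaps 2 of the 6 chunks in the grid.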