Python scientific libraries

Python scientific libraries#

After reading this, readers should be able to use the NumPy, Pandas, and Matplotlib libraries for data processing.

Why use libraries?#

Libraries simplify data processing by providing tools built for the specific task at hand.

Python libraries commonly used in maritime meteorological data processing:

No	Library	Use
1	NumPy	Array manipulation
2	Pandas	Tabular data manipulation
3	Matplotlib	Data visualization
	Not covered in Training Module 1; covered in Training Module 2
4	Xarray	Multidimensional data manipulation
5	Cartopy	Map visualization
6	GeoPandas	Geospatial data manipulation

NumPy#

NumPy provides support for arrays, which are more efficient and convenient than Python lists for numerical data.

Concepts: creating arrays, array attributes, basic array operations.
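Array attributes are listed among the concepts but are not demonstrated in the cells of this section. The following is a minimal sketch, assuming only that NumPy is installed; the variable names `values` and `grid` are illustrative. It also contrasts list and array arithmetic, which is one practical reason arrays are more convenient for numerical data.

```python
import numpy as np

# A plain Python list: multiplying repeats the list instead of scaling it
values = [1, 2, 3]
repeated = values * 2                 # [1, 2, 3, 1, 2, 3]

# A NumPy array: arithmetic is applied element by element
scaled = np.array(values) * 2         # array([2, 4, 6])

# Basic attributes of a 2D array
grid = np.array([[1.1, 2.3], [3.1, 4.2]])
print(grid.shape)   # (2, 2): rows, columns
print(grid.ndim)    # 2: number of dimensions
print(grid.size)    # 4: total number of elements
print(grid.dtype)   # float64
```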

# Import the library
import numpy as np

Creating NumPy array#

# Create a one-dimensional array of integers
arr = np.array([1,2,3,4,5])
print(arr)

# Create a 2D array of floats
arr = np.array([[1.1, 2.3],[3.1,4.2]])
print(arr)

[1 2 3 4 5]
[[1.1 2.3]
 [3.1 4.2]]

Indexing#

# Indexing
# Create a 2D array
arr = np.array([[1,2,3],[4,5,6],[7,8,9]])
print(arr)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

# Access the element in the first row, second column
selected = arr[0,1]
print(selected)
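Beyond picking a single element, the same bracket syntax can select whole rows, whole columns, or a rectangular block. A small sketch on the same 3x3 array (the variable names are illustrative):

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

first_row = arr[0]          # whole first row -> [1 2 3]
second_col = arr[:, 1]      # second column of every row -> [2 5 8]
last_item = arr[-1, -1]     # negative indices count from the end -> 9
block = arr[:2, 1:]         # first two rows, last two columns

print(first_row, second_col, last_item)
print(block)
```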

Slicing#

# Slicing
# Create 1D array
arr = np.arange(10)
print(arr)

# Access the FIRST 5 elements
print(arr[:5])

# Access the LAST 5 elements
print(arr[-5:])

# Access elements having even index
print(arr[::2])

[0 1 2 3 4 5 6 7 8 9]
[0 1 2 3 4]
[5 6 7 8 9]
[0 2 4 6 8]

Math operations#

# Mathematical Operations
# Create 1D array
arr = np.array([1,2,3,4,5])
print(arr)

# Add 2 to each element
print(arr+2)

# Multiply each element by 3
print(arr*3)

# Compute dot products of the array with itself
print(np.dot(arr,arr))

[1 2 3 4 5]
[3 4 5 6 7]
[ 3  6  9 12 15]
55
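The same operators also work between two arrays of equal shape, pairing elements by position. A brief sketch, with `a` and `b` as illustrative arrays, that also shows how the element-wise product differs from the dot product:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])
b = np.array([10, 20, 30, 40, 50])

total = a + b          # element-wise sum
product = a * b        # element-wise product (not a dot product)
squared = a ** 2       # element-wise power
dot = np.dot(a, b)     # sum of the element-wise products

print(total)
print(product)
print(squared)
print(dot)
```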

Aggregation#

# Aggregation Operations
# Create 1D array
arr = np.array([1,2,3,4,5])
print(arr)

# Finding max value
print(np.max(arr))

# Finding min value
print(np.min(arr))

# Finding standard deviation
print(np.std(arr))

[1 2 3 4 5]
5
1
1.4142135623730951
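For 2D arrays, aggregations can run along one axis instead of over all elements. The sketch below assumes a small illustrative 2x3 array. It also shows that np.std computes the population standard deviation by default (ddof=0), which is why the output above is 1.414...; passing ddof=1 gives the sample standard deviation.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

col_max = arr.max(axis=0)     # maximum down each column -> [4 5 6]
row_sum = arr.sum(axis=1)     # sum across each row -> [ 6 15]
overall_mean = arr.mean()     # mean of all elements -> 3.5

# Sample standard deviation (divides by n - 1 instead of n)
sample_std = np.std(np.array([1, 2, 3, 4, 5]), ddof=1)

print(col_max, row_sum, overall_mean, sample_std)
```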

Reshaping and Transposing#

# Create 1D array
arr = np.array([1,2,3,4,5,6])
print(arr)

# Reshape the array into a 2D array
result = arr.reshape(2,3)
print(result)

# Transpose array
print(result.T)

[1 2 3 4 5 6]
[[1 2 3]
 [4 5 6]]
[[1 4]
 [2 5]
 [3 6]]
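When only one dimension of the new shape is known, reshape accepts -1 for the other and infers it from the array size. A short sketch with illustrative names:

```python
import numpy as np

arr = np.arange(12)

auto_cols = arr.reshape(3, -1)   # -1 lets NumPy infer the 4 columns
transposed = auto_cols.T         # shape becomes (4, 3)
flat = auto_cols.ravel()         # back to one dimension

print(auto_cols.shape, transposed.shape, flat.shape)
```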

Pandas#

# Import the library
import pandas as pd

Series#

A Series is a one-dimensional data structure, like an array or a list, but with an index.

# Create a Series
s = pd.Series(
    [1, 3, 5, np.nan, 6, 8]
)
print(s)

0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
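The index does not have to be the default 0, 1, 2, ...; it can hold labels. A minimal sketch, assuming hypothetical temperatures labelled by city name:

```python
import pandas as pd

# Hypothetical temperatures, labelled by city name
temps = pd.Series([27.0, 30.0, 29.0], index=['Jakarta', 'Tokyo', 'NewYork'])

print(temps['Tokyo'])          # access by label
print(temps.iloc[0])           # access by position
print(temps.index.tolist())    # the labels themselves
```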

DataFrame#

A DataFrame is a two-dimensional data structure, like an Excel table.

# Create a DataFrame
data = {'temp': [27,30,29],
        'rh': [70,75,80],
        'wx': ['clear','cloudy','rain']}

df = pd.DataFrame(data,index=['Jakarta','Tokyo','NewYork'])
df

	temp	rh	wx
Jakarta	27	70	clear
Tokyo	30	75	cloudy
NewYork	29	80	rain

print(df)

         temp  rh      wx
Jakarta    27  70   clear
Tokyo      30  75  cloudy
NewYork    29  80    rain

df.describe()

	temp	rh
count	3.000000	3.0
mean	28.666667	75.0
std	1.527525	5.0
min	27.000000	70.0
25%	28.000000	72.5
50%	29.000000	75.0
75%	29.500000	77.5
max	30.000000	80.0

Indexing & Slicing#

Accessing Columns#

df['temp']     # or df.temp

Jakarta    27
Tokyo      30
NewYork    29
Name: temp, dtype: int64

Accessing Rows by Label (`.loc`)#

df.loc['Tokyo']

temp        30
rh          75
wx      cloudy
Name: Tokyo, dtype: object

df.loc[['Tokyo', 'NewYork']]

	temp	rh	wx
Tokyo	30	75	cloudy
NewYork	29	80	rain

Accessing Rows by Position (`.iloc`)#

# access the row at position 0
df.iloc[0]

temp       27
rh         70
wx      clear
Name: Jakarta, dtype: object

# access the rows at positions 1 and 2 (the stop index 3 is excluded)
df.iloc[1:3]

	temp	rh	wx
Tokyo	30	75	cloudy
NewYork	29	80	rain

Accessing a Specific Cell (Row & Column Together)#

# access the temp value in the Jakarta row
df.loc['Jakarta', 'temp']

np.int64(27)

# access the rh value in the row at position 2 (NewYork)
df.iloc[2, 1]

np.int64(80)

Slicing Rows and Columns Together#

# Select the rows from Tokyo through NewYork and the columns from temp through rh (with .loc, both endpoints are included)
df.loc['Tokyo':'NewYork', 'temp':'rh']

	temp	rh
Tokyo	30	75
NewYork	29	80

Filtering#

# Show the cities with a temperature (temp) above 28°C
df[df['temp'] > 28]

	temp	rh	wx
Tokyo	30	75	cloudy
NewYork	29	80	rain
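Several conditions can be combined in one filter. The sketch below rebuilds the same example DataFrame so that it runs on its own; the result names are illustrative. Each condition needs its own parentheses because & and | bind more tightly than the comparison operators.

```python
import pandas as pd

df = pd.DataFrame(
    {'temp': [27, 30, 29],
     'rh': [70, 75, 80],
     'wx': ['clear', 'cloudy', 'rain']},
    index=['Jakarta', 'Tokyo', 'NewYork'],
)

# AND: warmer than 28 and relative humidity of at least 80
warm_and_humid = df[(df['temp'] > 28) & (df['rh'] >= 80)]

# OR: rows that are clear or rainy
clear_or_rain = df[(df['wx'] == 'clear') | (df['wx'] == 'rain')]

# isin is a compact alternative for membership tests
same_rows = df[df['wx'].isin(['clear', 'rain'])]

print(warm_and_humid)
print(clear_or_rain)
```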

Math operations#

comfort_index = 0.5*(df['temp']+df['rh'])
df['comfort_index'] = comfort_index
df

	temp	rh	wx	comfort_index
Jakarta	27	70	clear	48.5
Tokyo	30	75	cloudy	52.5
NewYork	29	80	rain	54.5

df.loc['Jakarta', 'temp'] += 2
df.loc['Tokyo', 'temp'] -= 2
df

	temp	rh	wx	comfort_index
Jakarta	29	70	clear	48.5
Tokyo	28	75	cloudy	52.5
NewYork	29	80	rain	54.5

Aggregation#

RHmin = df['rh'].min()
RHmean = df['rh'].mean()
RHmax = df['rh'].max()
RHQ1 = df['rh'].quantile(0.25)
RHQ3 = df['rh'].quantile(0.75)
print(RHmin,RHmean,RHmax,RHQ1,RHQ3)

70 75.0 80 72.5 77.5

💡 WORTH A TRY: NumPy x Pandas#

Create a DataFrame with 2 columns: the odd numbers from 1 to 100 and the even numbers from 1 to 100.
Add a new column containing the result of 2*(odd element + even element).
Add a new column of Booleans indicating whether the odd number in the same row is prime. Hint: write a function and use the pandas apply method.
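As a pointer for the hint, and deliberately not a solution to the exercise, here is a sketch of how apply runs a function on every element of a column. The helper `is_hot` and its `threshold` parameter are hypothetical names invented for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'temp': [27, 30, 29]},
                  index=['Jakarta', 'Tokyo', 'NewYork'])

def is_hot(temp, threshold=28):
    """Return True when temp exceeds threshold (hypothetical helper)."""
    return temp > threshold

# apply calls the function once per element of the column
df['hot'] = df['temp'].apply(is_hot)
print(df)
```

The same pattern should carry over to the exercise: replace `is_hot` with a function that tests one number for primality.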

Matplotlib#

Matplotlib is the most commonly used data visualization library in Python. Its main module is pyplot, which is usually imported as plt.

import matplotlib.pyplot as plt

Creating a Simple Plot#

Use plt.subplots() to create a figure and axes, then call .plot() on the axes to draw the data.

# Import the library
import matplotlib.pyplot as plt

fig, ax = plt.subplots()             # Create a figure containing a single Axes.
ax.plot([1, 2, 3, 4], [1, 4, 2, 3])  # Plot some data on the Axes.
plt.show()                           # Show the figure.

Anatomy of a Figure#

Every plot is made up of these components:

Figure: the main canvas
Axes: the plotting area
Title: the plot title (ax.set_title)
X & Y labels: the axis names (ax.set_xlabel, ax.set_ylabel)
Legend, Grid, Spines, Ticks, etc.
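The components listed above can each be set with one call on the Axes object. The following is a minimal sketch with made-up data and label text. In a notebook the figure renders automatically; in a script, add plt.show() at the end.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()                                # Figure and a single Axes
ax.plot([1, 2, 3, 4], [1, 4, 2, 3], label='series A')   # label feeds the legend
ax.set_title('Example plot')                            # title
ax.set_xlabel('x value')                                # x-axis label
ax.set_ylabel('y value')                                # y-axis label
ax.legend()                                             # legend built from the labels
ax.grid(True)                                           # grid lines
ax.spines['top'].set_visible(False)                     # hide the top spine
```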

📌 See the following figure for a visual reference:

Setting Figure Size and Layout#

Pass figsize=(width, height), in inches, when creating the figure.

fig = plt.figure()
plt.show()

<Figure size 640x480 with 0 Axes>

fig = plt.figure(figsize=(13,5))

<Figure size 1300x500 with 0 Axes>
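The pixel sizes reported in the outputs above are the figure size in inches multiplied by the figure's DPI (dots per inch). A small sketch that makes the relationship explicit by passing dpi=100; the variable names are illustrative:

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(13, 5), dpi=100)   # 13 x 5 inches at 100 dots per inch

width_in, height_in = fig.get_size_inches()  # size in inches
width_px = width_in * fig.dpi                # size in pixels
height_px = height_in * fig.dpi

print(width_in, height_in)
print(width_px, height_px)
```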

Adding Multiple Axes#

Use fig.add_axes() to place several panels freely. Its argument is [left, bottom, width, height], given as fractions of the figure size.

np.linspace(0, 100, 11)
np.zeros((2, 1), int)

fig = plt.figure()
ax = fig.add_axes([0,0,1,1])

fig = plt.figure()
ax1 = fig.add_axes([0,0,1,1])
ax2 = fig.add_axes([0.6,0,0.3,0.5], facecolor='b')
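The list passed to fig.add_axes() is [left, bottom, width, height], each given as a fraction of the figure size. The sketch below uses illustrative values to leave a margin around the main panel and draw a small inset in its lower right corner; the plotted data are made up.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 100, 11)

fig = plt.figure()
# rect = [left, bottom, width, height], each as a fraction of the figure
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])
inset_ax = fig.add_axes([0.6, 0.15, 0.25, 0.3])   # small panel at lower right

main_ax.plot(x, x ** 2)
inset_ax.plot(x, np.sqrt(x), color='r')
```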

Subplots: Multiple Plots in One Figure#

Use plt.subplots(nrows, ncols) to create a grid of plots automatically.

fig = plt.figure(figsize=(12,6))
axes = fig.subplots(nrows=2, ncols=3)

fig,ax = plt.subplots(figsize=(8,2), nrows=1, ncols=2)
ax[0].plot([1,2,3],[4,5,1])
ax[0].plot([1,2,3],[3,2,4])
ax[1].plot([1,2,3],[4,5,1])
ax[1].plot([1,2,3],[3,2,4])
plt.show()
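With more than one row and column, plt.subplots() returns the axes as a 2D NumPy array, so a loop can fill every panel. A sketch under that assumption, using illustrative sine curves:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 6))

# axes is a 2D array of Axes; .flat iterates over it row by row
for k, ax in enumerate(axes.flat, start=1):
    ax.plot(x, np.sin(k * x))
    ax.set_title(f'sin({k}x)')

fig.tight_layout()   # reduce overlap between neighbouring panels
```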

Adding Labels, Titles, and Annotations#

fig,ax = plt.subplots(figsize=(9,2),ncols=2)
ax0,ax1 = ax

ax0.plot([1,2,3],[4,5,1])
ax0.set_xlabel('month')
ax0.set_ylabel('quality')
ax0.set_title('quality series')

ax1.plot([1,2,3],[3,2,4])
ax1.set_xlabel('month')
ax1.set_ylabel('quantity')
ax1.set_title('quantity series')

plt.show()

import numpy as np
x = np.linspace(-np.pi, np.pi, 100)
y = np.cos(x)
z = np.sin(6*x)

fig, ax = plt.subplots()
ax.plot(x, y)
ax.text(-3, 0.3, 'hello world')
ax.annotate('the maximum', xy=(0, 1),
             xytext=(0, 0), arrowprops={'facecolor': 'k'})

Text(0, 0, 'the maximum')

E. Tips#

1. Calling methods stored in a library#

The available methods can be listed by typing . after the library name and then pressing the Tab key.

***

np.
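Outside an interactive session, where Tab completion is not available, the built-in dir() function gives a comparable listing. A small sketch that filters the NumPy namespace for names containing "cos"; the variable name `cos_names` is illustrative:

```python
import numpy as np

# Public names in the numpy namespace that contain "cos"
cos_names = [name for name in dir(np)
             if 'cos' in name and not name.startswith('_')]
print(cos_names)
```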

2. Getting help#

Get help with the help() function or, in IPython and Jupyter, by appending ? to a name.

help(np.array)

Help on built-in function array in module numpy:

array(...)
    array(object, dtype=None, *, copy=True, order='K', subok=False, ndmin=0,
          like=None)

    Create an array.

    Parameters
    ----------
    object : array_like
        An array, any object exposing the array interface, an object whose
        ``__array__`` method returns an array, or any (nested) sequence.
        If object is a scalar, a 0-dimensional array containing object is
        returned.
    dtype : data-type, optional
        The desired data-type for the array. If not given, NumPy will try to use
        a default ``dtype`` that can represent the values (by applying promotion
        rules when necessary.)
    copy : bool, optional
        If ``True`` (default), then the array data is copied. If ``None``,
        a copy will only be made if ``__array__`` returns a copy, if obj is
        a nested sequence, or if a copy is needed to satisfy any of the other
        requirements (``dtype``, ``order``, etc.). Note that any copy of
        the data is shallow, i.e., for arrays with object dtype, the new
        array will point to the same objects. See Examples for `ndarray.copy`.
        For ``False`` it raises a ``ValueError`` if a copy cannot be avoided.
        Default: ``True``.
    order : {'K', 'A', 'C', 'F'}, optional
        Specify the memory layout of the array. If object is not an array, the
        newly created array will be in C order (row major) unless 'F' is
        specified, in which case it will be in Fortran order (column major).
        If object is an array the following holds.

        ===== ========= ===================================================
        order  no copy                     copy=True
        ===== ========= ===================================================
        'K'   unchanged F & C order preserved, otherwise most similar order
        'A'   unchanged F order if input is F and not C, otherwise C order
        'C'   C order   C order
        'F'   F order   F order
        ===== ========= ===================================================

        When ``copy=None`` and a copy is made for other reasons, the result is
        the same as if ``copy=True``, with some exceptions for 'A', see the
        Notes section. The default order is 'K'.
    subok : bool, optional
        If True, then sub-classes will be passed-through, otherwise
        the returned array will be forced to be a base-class array (default).
    ndmin : int, optional
        Specifies the minimum number of dimensions that the resulting
        array should have.  Ones will be prepended to the shape as
        needed to meet this requirement.
    like : array_like, optional
        Reference object to allow the creation of arrays which are not
        NumPy arrays. If an array-like passed in as ``like`` supports
        the ``__array_function__`` protocol, the result will be defined
        by it. In this case, it ensures the creation of an array object
        compatible with that passed in via this argument.

        .. versionadded:: 1.20.0

    Returns
    -------
    out : ndarray
        An array object satisfying the specified requirements.

    See Also
    --------
    empty_like : Return an empty array with shape and type of input.
    ones_like : Return an array of ones with shape and type of input.
    zeros_like : Return an array of zeros with shape and type of input.
    full_like : Return a new array with shape of input filled with value.
    empty : Return a new uninitialized array.
    ones : Return a new array setting values to one.
    zeros : Return a new array setting values to zero.
    full : Return a new array of given shape filled with value.
    copy: Return an array copy of the given object.


    Notes
    -----
    When order is 'A' and ``object`` is an array in neither 'C' nor 'F' order,
    and a copy is forced by a change in dtype, then the order of the result is
    not necessarily 'C' as expected. This is likely a bug.

    Examples
    --------
    >>> import numpy as np
    >>> np.array([1, 2, 3])
    array([1, 2, 3])

    Upcasting:

    >>> np.array([1, 2, 3.0])
    array([ 1.,  2.,  3.])

    More than one dimension:

    >>> np.array([[1, 2], [3, 4]])
    array([[1, 2],
           [3, 4]])

    Minimum dimensions 2:

    >>> np.array([1, 2, 3], ndmin=2)
    array([[1, 2, 3]])

    Type provided:

    >>> np.array([1, 2, 3], dtype=complex)
    array([ 1.+0.j,  2.+0.j,  3.+0.j])

    Data-type consisting of more than one element:

    >>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])
    >>> x['a']
    array([1, 3], dtype=int32)

    Creating an array from sub-classes:

    >>> np.array(np.asmatrix('1 2; 3 4'))
    array([[1, 2],
           [3, 4]])

    >>> np.array(np.asmatrix('1 2; 3 4'), subok=True)
    matrix([[1, 2],
            [3, 4]])

# or use ?
np?

Type:        module
String form: <module 'numpy' from '/home/tyo/miniconda3/envs/ofs/lib/python3.13/site-packages/numpy/__init__.py'>
File:        ~/miniconda3/envs/ofs/lib/python3.13/site-packages/numpy/__init__.py
Docstring:  
NumPy
=====

Provides
  1. An array object of arbitrary homogeneous items
  2. Fast mathematical operations over arrays
  3. Linear Algebra, Fourier Transforms, Random Number Generation

How to use the documentation
----------------------------
Documentation is available in two forms: docstrings provided
with the code, and a loose standing reference guide, available from
`the NumPy homepage <https://numpy.org>`_.

We recommend exploring the docstrings using
`IPython <https://ipython.org>`_, an advanced Python shell with
TAB-completion and introspection capabilities.  See below for further
instructions.

The docstring examples assume that `numpy` has been imported as ``np``::

  >>> import numpy as np

Code snippets are indicated by three greater-than signs::

  >>> x = 42
  >>> x = x + 1

Use the built-in ``help`` function to view a function's docstring::

  >>> help(np.sort)
  ... # doctest: +SKIP

For some objects, ``np.info(obj)`` may provide additional help.  This is
particularly true if you see the line "Help on ufunc object:" at the top
of the help() page.  Ufuncs are implemented in C, not Python, for speed.
The native Python help() does not know how to view their help, but our
np.info() function does.

Available subpackages
---------------------
lib
    Basic functions used by several sub-packages.
random
    Core Random Tools
linalg
    Core Linear Algebra Tools
fft
    Core FFT routines
polynomial
    Polynomial tools
testing
    NumPy testing tools
distutils
    Enhancements to distutils with support for
    Fortran compilers support and more (for Python <= 3.11)

Utilities
---------
test
    Run numpy unittests
show_config
    Show numpy build configuration
__version__
    NumPy version string

Viewing documentation using IPython
-----------------------------------

Start IPython and import `numpy` usually under the alias ``np``: `import
numpy as np`.  Then, directly past or use the ``%cpaste`` magic to paste
examples into the shell.  To see which functions are available in `numpy`,
type ``np.<TAB>`` (where ``<TAB>`` refers to the TAB key), or use
``np.*cos*?<ENTER>`` (where ``<ENTER>`` refers to the ENTER key) to narrow
down the list.  To view the docstring for a function, use
``np.cos?<ENTER>`` (to view the docstring) and ``np.cos??<ENTER>`` (to view
the source code).

Copies vs. in-place operation
-----------------------------
Most of the functions in `numpy` return a copy of the array argument
(e.g., `np.sort`).  In-place versions of these functions are often
available as array methods, i.e. ``x = np.array([1,2,3]); x.sort()``.
Exceptions to this rule are documented.
