
distributed

Create and access elements of distributed arrays from client

    Description

    A distributed array on the client represents an array that is partitioned out among the workers in a parallel pool. You operate on the entire array as a single entity; however, workers operate only on their part of the array and automatically transfer data between themselves when necessary. A distributed array resembles a normal MATLAB® array in the way you index and manipulate its elements, but none of its elements exist on the client. Codistributed arrays that you create inside spmd statements are accessible as distributed arrays from the client.

    Creation

    Use the distributed function or use the "distributed" option of array creation functions such as ones or zeros. For a list of array creation functions that create distributed arrays directly on workers, see Alternative Functionality.
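    As a minimal sketch of the "distributed" creation option, the following builds arrays directly on the workers without first allocating them on the client. It assumes Parallel Computing Toolbox is installed and that gcp can open a pool with your default profile; the variable names are illustrative only.

    ```matlab
    % Sketch: create distributed arrays directly on the workers.
    % Assumption: a parallel pool can be opened with the default profile.
    pool = gcp;                             % open or reuse a parallel pool

    A = ones(1000, "distributed");          % 1000-by-1000 array of ones
    B = zeros(1000, 500, "distributed");    % 1000-by-500 array of zeros

    S = sum(A, 1);                          % column sums, still distributed
    s = gather(S);                          % 1-by-1000 double on the client
    ```

    Because A and B never exist on the client, this pattern is the one to prefer when the full array would not fit in client memory.
    
    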

    Syntax

    D = distributed(ds)
    D = distributed(X)
    D = distributed(C,dim)
    D = distributed(tX)

    Description

    D = distributed(ds) creates a distributed array from a datastore ds. D is a distributed array stored in parts on the workers of the open parallel pool.

    To retrieve the distributed array elements from the pool back to an array in the MATLAB workspace, use the gather function.


    D = distributed(X) creates a distributed array from an array X.

    Use this syntax to create a distributed array from local data only if the MATLAB client can store all of X in memory. To create large distributed arrays, use the previous syntax to create a distributed array from a datastore, or use the "distributed" option of array creation functions such as ones, zeros, or any other creation function listed in Alternative Functionality.

    If the input argument is already a distributed array, the result is the same as the input.
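    A short sketch of this syntax, assuming a parallel pool is available and using a small magic square as stand-in data. It distributes a client array, operates on it, and uses gather to bring the result back to the client workspace.

    ```matlab
    % Sketch: distribute a small client-side array and gather it back.
    X = magic(4);               % ordinary array in client memory
    D = distributed(X);         % partitioned across the pool workers

    D2 = distributed(D);        % input is already distributed, so D2 matches D
    Y = gather(2 * D2);         % compute on the workers, then copy to the client

    tf = isequal(Y, 2 * X);     % compare against the same computation on the client
    ```
    
    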


    D = distributed(C,dim) creates a distributed array from the Composite object C, with the entries of C concatenated and distributed along the dimension dim. If you omit dim, then the first dimension is the distribution dimension.

    All entries of the Composite object must have the same class. Dimensions other than the distribution dimension must be the same.
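    The following sketch illustrates how the choice of dim affects the result. It assumes an open pool and a release that provides spmdIndex (R2022b or later; earlier releases use labindex). Each worker contributes a 2-by-3 block, so the entries agree in class and in every dimension other than the distribution dimension.

    ```matlab
    % Sketch: build a Composite in spmd, then concatenate it two ways.
    pool = gcp;
    n = pool.NumWorkers;

    spmd
        C = spmdIndex * ones(2, 3);   % each worker holds its own 2-by-3 block
    end

    D1 = distributed(C);       % dim omitted: concatenated along dimension 1
    D2 = distributed(C, 2);    % concatenated along dimension 2

    sz1 = size(D1);            % [2*n 3]
    sz2 = size(D2);            % [2 3*n]
    ```
    
    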


    D = distributed(tX) converts the tall array tX into a distributed array distributed along the first dimension. tX must be defined in a parallel environment that can run distributed arrays. (since R2023b)
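    As a hedged sketch of the tall-array conversion, the code below points the tall array execution environment at the current pool with mapreducer and then converts an in-memory tall array. It assumes R2023b or later; the random data is purely illustrative.

    ```matlab
    % Sketch: convert a tall array to a distributed array (R2023b or later).
    pool = gcp;
    mapreducer(pool);             % evaluate tall arrays on the parallel pool

    tX = tall(rand(1000, 3));     % tall array backed by in-memory data
    D = distributed(tX);          % distributed along the first dimension

    sz = size(D);                 % [1000 3]
    ```
    
    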


    Input Arguments


    ds: Datastore, specified as one of the following objects.

    Object                                        Type
    TabularTextDatastore object                   Text files
    ImageDatastore object                         Image files
    SpreadsheetDatastore object                   Spreadsheet files
    KeyValueDatastore object                      MAT-files and sequence files produced by mapreduce
    FileDatastore object                          Custom format files
    TallDatastore object                          MAT-files and sequence files produced by the write function of the tall data type
    ParquetDatastore object                       Parquet files
    DatabaseDatastore (Database Toolbox) object   Database

    X: Array to distribute, specified as an array.

    C: Composite object to distribute, specified as a Composite object.

    dim: Distribution dimension, specified as a scalar integer. The distribution dimension is the dimension along which the entries of the Composite object are concatenated and distributed.

    tX: Tall array to convert to a distributed array, specified as a tall array. The tall array must be defined in a parallel environment that supports distributed arrays. (since R2023b)

    Output Arguments


    D: Distributed array stored in parts on the workers of the open parallel pool, returned as a distributed array.

    Object Functions

    gather              Transfer distributed array, Composite object, or gpuArray object to local workspace
    write               Write distributed data to an output location

    Several MATLAB toolboxes include functions with distributed array support. For a list of functions in all MathWorks® products that support distributed arrays, see All Functions List (Distributed Arrays).

    Several object functions enable you to examine the characteristics of a distributed array. Most behave like the MATLAB functions of the same name.

    isdistributed       True for distributed array
    isreal              Determine whether array uses complex storage
    isUnderlyingType    Determine whether input has specified underlying data type
    length              Length of largest array dimension
    ndims               Number of array dimensions
    size                Array size
    underlyingType      Type of underlying data determining array behavior
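    A brief sketch of these inspection functions on a small distributed array. It assumes an open pool and a release that includes underlyingType and isUnderlyingType (R2020b or later); the single-precision magic square is just sample data.

    ```matlab
    % Sketch: examine the characteristics of a distributed array.
    D = distributed(single(magic(6)));

    tfDist = isdistributed(D);                  % true for a distributed array
    sz     = size(D);                           % array size, here [6 6]
    nd     = ndims(D);                          % number of dimensions, here 2
    len    = length(D);                         % largest dimension, here 6
    typ    = underlyingType(D);                 % type of the stored data
    tfType = isUnderlyingType(D, "single");     % checks the underlying type
    tfReal = isreal(D);                         % true when storage is not complex
    ```

    Note that class(D) reports distributed, which is why underlyingType is the function to use when you need the type of the stored elements.
    
    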

    Examples


    This example shows how to create and load a distributed array from a datastore.

    First, create a datastore using an example data set. This data set is too small to show equal partitioning of the data over the workers. To simulate a large data set, artificially increase the size of the datastore using repmat.

    files = repmat("airlinesmall.csv",10,1);
    ds = tabularTextDatastore(files);
    

    Select the example variables.

    ds.SelectedVariableNames = ["DepTime", "DepDelay"];
    ds.TreatAsMissing = "NA";
    

    Create a distributed table by reading the datastore in parallel. Partition the datastore with one partition per worker. Each worker then reads all data from the corresponding partition. The files must be in a shared location accessible from the workers.

    dt = distributed(ds);
    Starting parallel pool (parpool) using the 'Processes' profile ... connected to 4 workers.

    Finally, display summary information about the distributed table.

    summary(dt) 
    Variables:
    
        DepTime: 1,235,230×1 double
            Values:
    
                min          1
                max       2505
                NaNs    23,510
    
        DepDelay: 1,235,230×1 double
            Values:
    
                min      -1036
                max       1438
                NaNs    23,510
    
