Efficient implementation of multidimensional array

Utwig · October 22nd, 2004, 10:18 AM

Hello everybody!
I have the problem of effective implementation of multidimensional arrays in C#. In my application I'm using arrays with doubles as a fundamental structure. The number of dimensions and their length is known only in runtime, but typical array is approximately 4-10 dimensional with the length 1-15 for each dimension. Currently (because of historical reasons) these "arrays" are implemented as nested ArrayLists. This means there is a root-level ArrayList that contains "references" to the array lists of the second level and so on (the array list of the last level contains the double values).
Correspondly, if I want to get an element with "coordinates" [2, 3, 4, 5] in a 4-dimensional space I need to run through the levels of array list taking the arraylist with index 2 from the root list, from it the array list with index 3, from the last the array list with index 4, and from the very last array-list the double value with index 5. This is certainly quite ineffective, and I tested an alternative with built-in multidimensional array that is created with the following construction:
System.Array tempSpace = Array.CreateInstance(System.Type.GetType("System.D ouble"), allDimensions);
where allDimensions is a one-dimensional array with ints - dimensÃons for my multidimensional space.
Suprisingly, I found only 2-3 times difference in comparison with implementation on array lists (I tested looping through all elements, incrementing all elements, random access. Code can be provided if needed).
I suspect that the performance problem with built-in multidimensional array is mainly occured by boxing/inboxing (I do not know how to create a multidimensional array so that it will not require boxing/inboxing not knowing the number of dimensions in compile-time). Accordingly, I have following questions:
1. Is it possible to create a multidimensional array so that it will not require boxing/inboxing not knowing the number of dimensions in compile-time? (type of elements is known in compile time - double).
2. If the answer on the second question is no, what will be the better implementation for multidimensional array of doubles of my size (4-10 dimensions with the length 1-15 of each dimension):
a) Built-in multidimensional arrays that suffer (supposedly) from boxing/inboxing.
b) Large one-dimension array, and I need to write some code for index managing (calculating the position of element in one-dimensional array that corresponds to the position in the multidimensional array given its indices).
c) Your solution.
So, with my solutions on the one hand I have boxing/unboxing issue, on the other hand - index managing (which also requires some computational resources) and my time for implementing the solution.
I expect the following functionality from the implementation:
1. Fast random access (write and read) basing on given indices characterizing the position in the multidimensional space (indices are not predictable - random).
2. Fast sequential access (write and read) basing on given indices. By "sequential" here I mean that I will start at some position of the space and get all "neighboring" positions sequentially in some dimension (get a slice from the space).
3. The dimensions of the array are quite rigid and basically are not changed after creation of the array. The increase of dimensions/length of dimension is never required, but there can be situations when I will set to zero a whole "slice" (whole dimension), if the amount of zeroed dimensions is high it may be advantegeous to "compress" the array. This is not a "need to be" thing, it will be always possible to realize it making a compressed array and copying elements to it.
4. It will be also not bad to have some method to efficiently set all elements of array to some value (0 or 1, for instance).
I would like to use only managed code. I have already read "Arrays Undocumented" article at http://www.codeproject.com/dotnet/arrays.asp
I would be glad to here any opinions on this topic, in particular answers on my questions. Thank you very much for your help.
Andrew Kuklin.

Ankur_Verma · October 23rd, 2004, 06:57 AM

Your problem has been solved in the new specification of C# with the introduction of generics. but i guess you will not be able to use it until .net 2005 final release gets launched. for now post some code to help me think of something similar in the current version of .net.

Utwig · October 26th, 2004, 04:34 AM

2(Ankur_Verma)
Thank you very much for your answer, I can supply the following extra information:
1. I test the perfomance with following code:

Code:

    
//Create a test 10-dimensional array
int[] allDimensions = new int[10]{4, 7, 8, 6, 8, 2, 6, 7, 2, 7};
System.Array testArray = Array.CreateInstance(System.Type.GetType("System.Double"), allDimensions);            

//currentDimensions is an array representing "counters" of dimensions
//I'll explain this on example. If as tempTable we have a three-dimensional array
//with allDimensions=[4, 8, 2].
//Then currentDimensions will run through the following combinations:
//[0; 0; 0], [1; 0; 0], [2; 0; 0], [3; 0; 0], [0; 1; 0], [1; 1; 0], [2; 1; 0] etc.
int[] currentDimensions = new int[allDimensions.Length];
//done variable checks whether all "places" of the space were visited 
//(i.e. with the same example of three-dimensional array bool will be true when 
//currentDimensions=[3, 7, 1]
bool done=false;

while(!done)
{
    //Increment the current place in the space 
    double curValue = (double)testArray.GetValue(currentDimensions);
    testArray.SetValue(++curValue, currentDimensions);

    //Try to get the next combination of currentDimensions
    done=true;                
    for(int j=0; j<currentDimensions.Length; j++)
    {
        //If it is possible to increase the currentDimensions
        //(in the limits of allDimensions) - do it
        if(currentDimensions[j]<allDimensions[j]-1)
        {
            currentDimensions[j]++;
            done=false;

            //Set all previous dimension in currentDimensions to 0
            for(int k=0; k<j; k++)
            {
                currentDimensions[k]=0;
            }
            break;
        }
    }
}

I tried to comment it as much as possible. If something is not clear - do not hesitate to ask.
2. I have downloaded the beta version of Visual Studio .NET 2005 and tried the same code. The result - almost no difference. Around 5-10%. May be this will be changed in the final release, but I'm not sure that I can rely on that. The problem is (I assume) in the line

Code:

double curValue = (double)testArray.GetValue(currentDimensions);

Neither Visual Studio .NET 2003 nor Visual Studio .NET 2005 compile this without explicit cast to (double). (I wonder than why do I need to set the type in

Code:

System.Array testArray = Array.CreateInstance(System.Type.GetType("System.Double"), allDimensions);

)
3. I can also provide some information with memory allocation problems that I have with large-dimensional array (>12 dimensions) if you are interested in discussion.
And again, thank you for your help.