Determining the type of of inputs of a method

Imagine a method m(Texpr x) declared in some class C, for which T3 has to generate inputs; here Texpr is a type-expression. Using reflection, it is possible to figure out the compile-time Texp, without loss of information. To generate a instance of x, T3 needs to determine at least one concrete class D, which is a subtype of Texpr.

A type expression T1 is a subtype of another type expression if all instances/values satisfying T1 also satisfy T2. Notation: T1 ⊆ T2. This relation is reflexive and transitive.

There are several things that make the problem more complicated than it looks. Firstly, finding a concrete class D may not be sufficient. If this D is a generic, we may need to specify instantiations for its type variables.

Secondly, C may be a generic, e.g. C(tvar). If Texpr contains a free occurence of tvar, then we cannot just arbitrarily instantiate it. It should be the same instantiation we have chosen when we created a target instance of C. So, here is the more precise formulation of the problem:

Given a list Σ of type bindings, and a type expression T, a solution of T is a by-tool instantiable type expression T0 such that T0 is a subtype of subst(T,Σ).
A type expression T0 is concrete, if it does not contain a type variable, wildcard, an asbtract class, nor an interface.
A concrete type expression T0 is by-tool instantiable, if it can be instantiated by a tool such as T3. We will consider the following cases to be instantiable:

T0 specifies a primitive type.
T0 specifies a non-asbtract class with a public constructor or a public creation method.
T0 specifies an array.
Special cases, explained later.

A type binding is a pair varname←U, where varname is a name of a type variable, U is a type expression, assumed to be concrete and by-tool instantiable. subst(T,varname←U) replaces every free occurrences of a type variable in the type expression T with the specified name with U. If Σ is a list of type bindings, subst(T,Σ) specifies a simultaneous substitution of all bindings in Σ on T.

Notice if a solution T0 is found, then any by-tool instantiable subtype of T0 is also a solution.

On Java's isAssignableFrom, imprecise on generics

To check for subtyping at the runtime, the only help-function available to us is the method isAssignableFrom from the class Class. Unfortunately, this method operates on so-called raw types, rather than on type expression.

Given a type expression T, its raw version is obtained by replacing any type parameter with the type Object. For example, List(tvar) and List(Integer) would then reduce to List(Object), which in Java is denoted by simply List. Note that deeper parameters would simply gone. E.g. List(List(Object)) reduces to List(Object).

Obviously, isAssignableFrom is not precise enough for our purpose. For example:

interface I<T> { }
class Y extends I<Integer>{ }
void m(I<String> x) { ...}

Obviously, an instance of Y is anot a valid input for m. Yet we won't be able to infer this using only isAssignableFrom. With the method, we can only check: I.isAssignableFrom(Y), which would give true. The answer is in itself correct; but we posed a wrong query. We should asked for: I(String).isAssignableFrom(Y), but such a query is currently not possible.

Instantiating a parameterized class: algorithm solveClassTyvars

Suppose the algorithm solve below produces a type expression T0 as a solution, specifying a class C. It follows that this class is by-tool-instantiable. Next, T3 needs to produce an instance of it. If C has no parameter, this is quite straight forward.

However if C has type parameters, to instantiate the class, we also need to decide instantiation of its parameters. However, this can be more complicated that it looks. There are two possibilities:

T0 is of the form JTFun C [U1,U2,..], then it already specifies the instantiation. We return the following Σ: [tvar1 ← U1, tvar2 ← U2, ... ].
T0 is just JTFun C []. This is a bit complicated. Consider this example:
```
public class C < U extends T, T extends Collection > { ... }
```
Notice how a type-parameter-expression may refer to another parameter. They are not allowed to be cyclic though.
solveClassTyvars(tvar1, tvar2, ...) returns a list of substitutions; it works as follows:
1. It first re-order the tvars, such that if tvar-k refers to tvar-j, then it should appear latter in the ordering.
2. Suppose the new ordering is uvar1, uvar2, ...
3. Σ := ∅
4. do solve(Σ,uvar1) ; solve(Σ,uvar2), ... this will basically solve each tvars, but moreover, the solutions will be added into list of substitutions Σ.
5. return Σ

When an instance of C is created through e.g. a constructor, it may have a parameter of type tvar above. So the Σ needs to be passed on. But only in the scope of the invocation of this constructor.

When the instance of C is actually to be used as the target object, we neeed to pass Σ to the rest of test sequence.

Algorithm solve

The algorithm solve(Σ,T) takes a set of type substitutions Σ and a type expression T.

A substitution is a pair name←U where name is the name of a type variable, and U is a type expression that should not contain any type variable nor wildcard.

The algorithm tries to find a solution T0 for T as defined above; null if no such solution is found. Such a T0 is a by-tool instantiable type expression. We assume T3 knows how to create an object instance of such a type expression.
Due to the inaccuracy of isAssignableFrom (see above), the algorithm may give a T0 that is actually not a solution.
The algorithm has side effect on Σ which we may (or may not) need to pass on.

solve(Σ,T) is defined below:

If T specifies a primitive-like type (int,Integer,etc) : take T0=T.
Else, if T specifies a String : take T0=T.
Else, if T specifies an enumeration type, take T0=T.
T specifies an unparameterized class or interface (T = JTfun C []). Find a by-tool instantiable class C, such that T.isAssignableFrom(C). Take T0 = C. Note that C may have type-parameters, which may then need to be instantiated as well; this is handled by solveClassTyvars.
Else, if T = JTfun X [U] (for simplicity, assume we only have one paratemeter). We can't always handle this case soundly. Let U0 = solve(Σ,U). Find a by-tool instantiable class C, such that X.isAssignableFrom(C). Take T0= JTfun C [U0].
Because the problem with isAssignableFrom as mentioned above, we can't guarantee that this T0 is actually a subtype of T.
If T is a type variable tvar:
1. If (tvar←U) ∈ Σ, take T0 = solve(Σ,U).
2. Else, if tvar specifies no upper bound, then take T0 = Object. Add tvar←T0 to Σ.
3. Else, if tvar has a single upper bound: tvar extends U, then take T0 = solve(Σ,U). Add tvar←T0 to Σ.
4. If tvar has multiple upper bounds, we can't always handle this soundly. So, we have this form, e.g.: tvar extends U1 && U2. Let D1 and D2 be the top classes of U1 and U2. Find a class C, such that D1.isAssignableFrom(C) and D2.isAssignableFrom(C). Take T0 = C as the solution.
  Add tvar←T0 to Σ.
  Note that if U1 or U2 turns out to be parameterized, then this solution may not be sound, due to isAssignableFrom issue.
If T is a wildcard:
1. It specifies no upper nor lower bound. Take T0 = Object.
2. It specifies a single upper bound: ? extends U. Take T0 = solve(Σ,U).
3. It specifies a single lower bound: ? super U. Note if U is abstract no an interface, then we have no solution. Else, if it is of the form JTfun D [V] where D is a class name, and [V] is a posibly empty list of parameters:
  1. If JTfun D [V] is by-tool instantiable, take T0 = JTfun D V.
  2. Else, we can't correctly find a solution with just isAssignableFrom, nor with getSuperclass(). We can propose the following. Find a superclass C of D (e.g. using getSuperClass). If C is unparameterized, take T0=C. Else suppose it is parameterized by tvar. Take T0 = JTfun C [U0], where U0 = solve(Σ,U).
4. It has multiple bounds: currently not allowed in Java.

Special case: Functional Interface

Instantiating functional interfaces is currently not generally supported. Generally the approach is as follows. We first include functional interface as being by-tool instantiable. Then when given a T0 which is a functional interface:

We first determine the function type represented by T0.
We find a matching method.
We need to wrap the method in a lambda expression, bind it to an object, then pass the object to where it is needed.

fiGenerator(texpr) {
   ftype = calculateFunctionalTypeOf(texpr) ;
   Method fcandidate = findMatchingMethod(ftype) ;
   Function wrappedFcandidate = p -> {
       try { return fcandidate.invoke(null,p) ; }
       catch(Exception e) { return null ;} 
   } ;
   return  wrappedFcandidate ;	
}

The messy part is in the first step.

For now, we only support instantiation of the interface Function and Predicate.